Capacity Pre-Log of Noncoherent SIMO 
Channels via Hironaka's Theorem 

Veniamin I. Morgenshtern, Erwin Riegler, 
Wei Yang, Giuseppe Durisi, 
Shaowei Lin, Bernd Sturmfels, and Helmut Bolcskei 



Abstract: We find the capacity pre-log of a temporally correlated
Rayleigh block-fading single-input multiple-output (SIMO) channel
in the noncoherent setting. It is well known that for block-length
$L$ and rank of the channel covariance matrix equal to $Q$, the
capacity pre-log in the single-input single-output (SISO) case is
given by $1 - Q/L$. Here, $Q/L$ can be interpreted as the pre-log
penalty incurred by channel uncertainty. Our main result reveals
that, by adding only one receive antenna, this penalty can be
reduced to $1/L$ and can, hence, be made to vanish for the
block-length $L \to \infty$, even if $Q/L$ remains constant as
$L \to \infty$. Intuitively, even though the SISO channels between
the transmit antenna and the two receive antennas are statistically
independent, the transmit signal induces enough statistical
dependence between the corresponding receive signals for the second
receive antenna to be able to resolve the uncertainty associated
with the first receive antenna's channel and thereby make the
overall system appear coherent. The proof of our main theorem is
based on a deep result from algebraic geometry known as Hironaka's
Theorem on the Resolution of Singularities.



I. Introduction 

It is well known that the capacity pre-log, i.e., the asymptotic 
ratio between capacity and the logarithm of signal-to-noise ratio 
(SNR), as SNR goes to infinity, of a single-input multiple-output 
(SIMO) fading channel in the coherent setting (i.e., when the 
receiver has perfect channel state information (CSI)) is equal 
to 1 and is, hence, the same as that of a single-input single-output
(SISO) fading channel [4]. This result holds under very
general assumptions on the channel statistics. Multiple antennas
at the receiver only, hence, do not result in an increase of the
capacity pre-log in the coherent setting [4]. In the noncoherent
setting, where neither transmitter nor receiver has CSI, but both
know the channel statistics, the effect of multiple antennas on the
capacity pre-log is understood only for a specific simple channel
model, namely, the Rayleigh constant block-fading model. In
this model the channel is assumed to remain constant over a
block (of $L$ symbols) and to change in an independent fashion
from block to block [5]. The corresponding SIMO capacity pre-log
is again equal to the SISO capacity pre-log, but, differently
from the coherent setting, is given by $1 - 1/L$ [6], [7].

An alternative approach to capturing channel variations in time 
is to assume that the fading process is stationary. In this case, the 
capacity pre-log is known only in the SISO [8] and the multiple-input
single-output (MISO) [9, Thm. 4.15] cases. The capacity
bounds for the SIMO stationary-fading channel available in the
literature [9, Thm. 4.13] do not allow one to determine whether the
capacity pre-log in the SIMO case equals that in the SISO case. 
Resolving this question for stationary fading seems elusive at 
this point. 

A widely used channel model that can be seen as lying in 
between the stationary-fading model considered in [8], [9], 
and the simpler constant block-fading model analyzed in [5],
[7] is the correlated block-fading model, which assumes that
the fading process is temporally correlated within blocks of 
length L and independent across blocks. The L x L channel 
covariance matrix of rank $Q \leq L$ is taken to be the same for
each block. This channel model is relevant as it captures channel 
variations in time in an accurate yet simple fashion: the rank Q 
of the covariance matrix corresponds to the minimum number 
of channel coefficients per block that need to be known at the 
receiver to perfectly reconstruct all channel coefficients within 
the same block. Therefore, larger Q/L corresponds to faster 
channel variations. 

The SISO capacity pre-log for correlated block-fading channels
is given by $1 - Q/L$ [10]. In the SIMO and the multiple-input
multiple-output (MIMO) cases the capacity pre-log is unknown. 
The main contribution of this paper is a full characterization of 
the capacity pre-log for SIMO correlated block-fading channels. 
Specifically, we prove that under a mild technical condition on
the channel covariance matrix, the SIMO capacity pre-log of
a channel with $R$ receive antennas and independent identically
distributed (i.i.d.) SISO subchannels is given by

$$\chi = \min[1 - 1/L,\; R(1 - Q/L)]. \tag{1}$$

This shows that even with $R = 2$ receive antennas a capacity
pre-log of $1 - 1/L$ can be obtained in the SIMO case (provided
that $L \geq 2Q - 1$). This capacity pre-log is strictly larger than
the capacity pre-log of the corresponding SISO channel (i.e., the
capacity pre-log of one of the component channels), given by
$1 - Q/L$. Here $Q/L$ can be interpreted as the pre-log penalty due
to channel uncertainty. Our result reveals that, by adding at least
one receive antenna, this penalty can be made to vanish in the
large block-length limit, $L \to \infty$, even if the amount of channel
uncertainty scales linearly in the block-length.

1 In the remainder of the paper, we consider the noncoherent setting only.
Consequently, we will refer to capacity in the noncoherent setting simply as
capacity.

A conjecture for the correlated block-fading channel model 
stated in [10] for the MIMO case, when particularized to the 
SIMO case, implies that the capacity pre-log in the SIMO case 
would be the same as that in the SISO case. As a consequence
of (1), this conjecture is disproved.
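
To make the comparison between the SISO and the SIMO pre-log concrete, the following minimal sketch evaluates (1) with exact rational arithmetic. The function name `simo_prelog` and the parameter values are illustrative assumptions and do not appear in the paper.

```python
from fractions import Fraction


def simo_prelog(L: int, Q: int, R: int) -> Fraction:
    """Evaluate chi = min[1 - 1/L, R(1 - Q/L)] from (1) exactly."""
    if not (1 <= Q <= L and R >= 1):
        raise ValueError("need 1 <= Q <= L and R >= 1")
    return min(1 - Fraction(1, L), R * (1 - Fraction(Q, L)))


# Illustrative parameters (an assumption): L = 2Q - 1 with Q = 50.
L, Q = 99, 50
siso = simo_prelog(L, Q, R=1)  # reduces to 1 - Q/L
simo = simo_prelog(L, Q, R=2)  # reduces to 1 - 1/L because L >= 2Q - 1
print("SISO pre-log:", siso)
print("SIMO pre-log with R = 2:", simo)
```

With these values the SISO penalty $Q/L$ is roughly one half, whereas the two-antenna penalty is $1/L$.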

In terms of the technical aspects of our main result, we 
sandwich capacity between an upper and a lower bound that turn 
out to be asymptotically (in SNR) tight (in the sense of delivering 
the same capacity pre-log). The upper bound is established by 
proving that the capacity pre-log of a correlated block-fading 
channel with R receive antennas can be upper-bounded by 
the capacity pre-log of a constant block-fading channel with 
RQ receive antennas and the same SNR. The derivation of the 
capacity pre-log lower bound poses serious technical challenges. 
Specifically, after a change of variables argument applied to the 
integral expression for the differential entropy of the channel 
output signal, the main technical difficulty lies in showing that the 
expected logarithm of the Jacobian determinant corresponding 
to this change of variables is finite. As the Jacobian determinant
takes on a very involved form, a per pedes approach appears 
infeasible. The problem is resolved by first distilling structural 
properties of the determinant through a suitable factorization 
and then introducing a powerful tool from algebraic geometry, 
namely [11, Th. 2.3], which is a consequence of Hironaka's
Theorem on the Resolution of Singularities [12], [13]. Roughly
speaking, this result allows one to rewrite every real analytic function
[14, Def. 1.1.5, Def. 2.2.1] locally as a product of a monomial
and a nonvanishing real analytic function. This factorization 
is then used to show that the integral of the logarithm of the 
absolute value of a real analytic function over a compact set is 
finite, provided that the real analytic function is not identically 
zero. This method is quite general and may be of independent 
interest when one tries to show that integrals of certain functions 
with singularities are finite, in particular, functions involving 
logarithms. In information theory such integrals often occur 
when analyzing differential entropy. 

Notation: Sets are denoted by calligraphic letters $\mathcal{A}, \mathcal{B}, \ldots$
Roman letters $\mathrm{A}, \mathrm{B}, \ldots$ and $\mathrm{a}, \mathrm{b}, \ldots$ designate deterministic
matrices and vectors, respectively. Boldface letters $\mathbf{A}, \mathbf{B}, \ldots$
and $\mathbf{a}, \mathbf{b}, \ldots$ denote random matrices and random vectors,
respectively. We let $e_i$ be the vector (of appropriate dimension)
that has the $i$th entry equal to one and all other entries equal to
zero, and denote the $M \times M$ identity matrix as $I_M$. The element
in the $i$th row and $j$th column of a deterministic matrix $A$ is
$a_{ij}$ (italic letters), and the $i$th component of the deterministic
vector $u$ is $u_i$ (italic letters); the element in the $i$th row and $j$th
column of a random matrix $\mathbf{A}$ is $\mathsf{a}_{ij}$ (sans serif letters), and the
$i$th component of the random vector $\mathbf{u}$ is $\mathsf{u}_i$ (sans serif letters).
For a vector $u$, $\operatorname{diag}(u)$ stands for the diagonal matrix that has the
entries of $u$ on its main diagonal. The linear subspace spanned
by the vectors $u_1, \ldots, u_n$ is denoted by $\operatorname{span}\{u_1, \ldots, u_n\}$. The
superscripts $T$ and $H$ stand for transposition and Hermitian
transposition, respectively. For two matrices $A$ and $B$, we designate
their Kronecker product as $A \otimes B$; to simplify notation, we use
the convention that the ordinary matrix product precedes the
Kronecker product, i.e., $AB \otimes C = (AB) \otimes C$. For a finite
subset of the set of natural numbers, $\mathcal{I} \subset \mathbb{N}$, we write $|\mathcal{I}|$
for the cardinality of $\mathcal{I}$. For an $M \times N$ matrix $A$, and a set
of indices $\mathcal{I} \subset [1:M]$, we use $A_{\mathcal{I}}$ to denote the $|\mathcal{I}| \times N$
submatrix of $A$ containing the rows of $A$ with indices in $\mathcal{I}$.
For two matrices $A$ and $B$ of arbitrary size, $\operatorname{diag}(A, B)$ is the
$2 \times 2$ block-diagonal matrix that has $A$ in the upper left corner
and $B$ in the lower right corner. For $N$ matrices $A_1, \ldots, A_N$,
we let $\operatorname{diag}(A_1, \ldots, A_N) = \operatorname{diag}(\operatorname{diag}(A_1, \ldots, A_{N-1}), A_N)$.
The ordered eigenvalues of the $N \times N$ matrix $A$ are denoted
by $\lambda_1(A) \geq \cdots \geq \lambda_N(A)$. For two functions $f(\cdot)$ and $g(\cdot)$,
the notation $f(\cdot) = O(g(\cdot))$ means that $\lim_{u \to \infty} |f(u)/g(u)|$
is bounded. For a function $f(\cdot)$, we say that $f(\cdot)$ is not
identically zero and write $f(\cdot) \not\equiv 0$ if there exists at least one
element $u$ in the domain of $f(\cdot)$ such that $f(u) \neq 0$. We
say that a function $f(\cdot)$ is nonvanishing on a subset $\mathcal{S}$ of its
domain, if for all $u \in \mathcal{S}$, $f(u) \neq 0$. For two functions
$f(\cdot)$ and $g(\cdot)$, $(f \circ g)(\cdot)$ denotes the composition $f(g(\cdot))$. For
$x \in \mathbb{R}$, $\lceil x \rceil = \min\{m \in \mathbb{Z} \mid m \geq x\}$. We use $[n:m]$ to
designate the set of natural numbers $\{n, n + 1, \ldots, m\}$. Let
$g: \mathbb{C}^M \to \mathbb{C}^N$, $u \mapsto g(u)$, be a vector-valued function; then
$\partial g/\partial u$ denotes the $N \times M$ Jacobian matrix [15, Def. 3.8] of the
function $g(\cdot)$, i.e., the matrix that contains the partial derivative
$\partial g_i/\partial u_j$ in its $i$th row and $j$th column. The logarithm to the
base 2 is written as $\log(\cdot)$. For sets $\mathcal{A}, \mathcal{B} \subset \mathbb{R}^M$, we define
$\mathcal{A} \pm \mathcal{B} = \{a \pm b \mid a \in \mathcal{A}, b \in \mathcal{B}\}$. If $\mathcal{A} = \{a\}$, then
$a \pm \mathcal{B} = \mathcal{A} \pm \mathcal{B}$. With $(-\epsilon, \epsilon) = \{u \in \mathbb{R} \mid |u| < \epsilon\}$, we
denote by $\mathcal{C}(u, \epsilon) = u + (-\epsilon, \epsilon)^M \subset \mathbb{R}^M$ the open cube in
$\mathbb{R}^M$ with side length $2\epsilon$ centered at $u \in \mathbb{R}^M$. The set of natural
numbers, including zero, is $\mathbb{N}_0$. For $u \in \mathbb{C}^M$ and $m \in \mathbb{N}_0^M$,
we let $u^m = u_1^{m_1} \cdots u_M^{m_M}$. If $\mathcal{A}$ is a subset of the image of
a map $f(\cdot)$ then $f^{-1}(\mathcal{A})$ denotes the inverse image of $\mathcal{A}$. The
expectation operator is designated by $\mathbb{E}[\cdot]$. For random matrices
$\mathbf{A}$ and $\mathbf{B}$, we write $\mathbf{A} \sim \mathbf{B}$ to indicate that $\mathbf{A}$ and $\mathbf{B}$ have the
same distribution. Finally, $\mathcal{CN}(u, C)$ stands for the distribution
of a jointly proper Gaussian (JPG) random vector with mean $u$
and covariance matrix $C$.

II. System Model 

We consider a SIMO channel with R receive antennas. The 
fading in each SISO component channel follows the correlated 
block-fading model described in the previous section. The input- 
output (IO) relation within any block of length L for the mth 
SISO component channel can be written as 

$$\mathbf{y}_m = \sqrt{\rho}\,\operatorname{diag}(\mathbf{h}_m)\,\mathbf{x} + \mathbf{w}_m, \quad m \in [1:R], \tag{2}$$

where $\mathbf{x} = [x_1 \cdots x_L]^T \in \mathbb{C}^L$ is the signal vector transmitted
in the given block, and the vectors $\mathbf{y}_m, \mathbf{w}_m \in \mathbb{C}^L$ are the
corresponding received signal and additive noise, respectively,
at the $m$th receive antenna. Finally, $\mathbf{h}_m \in \mathbb{C}^L$ contains the
channel coefficients between the transmit antenna and the $m$th
receive antenna. We assume that $\mathbf{h}_m \sim \mathcal{CN}(0, DD^H)$, for all
$m \in [1:R]$, where $D \in \mathbb{C}^{L \times Q}$ (which is the same for all blocks
and all component channels) has rank $Q \leq L$. The entries of
the vectors $\mathbf{h}_m$ are taken to be of unit variance, which implies
that the main diagonal entries of $DD^H$ are equal to 1 and the
average received power is constant across time slots. It will
turn out to be convenient to write the channel coefficient vector in
whitened form as $\mathbf{h}_m = D\mathbf{s}_m$, where $\mathbf{s}_m \sim \mathcal{CN}(0, I_Q)$. Further,
we assume that $\mathbf{w}_m \sim \mathcal{CN}(0, I_L)$. As the noise vector has unit
variance components, $\rho$ in (2) can be interpreted as the SNR.
Finally, we assume that $\mathbf{s}_m$ and $\mathbf{w}_m$ are mutually independent,
independent across $m$, and change in an independent fashion
from block to block. Note that for $Q = 1$ the correlated
block-fading model reduces to the constant block-fading model as used
in [6], [7].



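With $\mathbf{y} = [\mathbf{y}_1^T \cdots \mathbf{y}_R^T]^T$, $\mathbf{s} = [\mathbf{s}_1^T \cdots \mathbf{s}_R^T]^T$,
$\mathbf{w} = [\mathbf{w}_1^T \cdots \mathbf{w}_R^T]^T$, and $\mathbf{X} = \operatorname{diag}(\mathbf{x})$, we can write the IO
relation (2) in the following more compact form:

$$\mathbf{y} = \sqrt{\rho}\,(I_R \otimes \mathbf{X}D)\,\mathbf{s} + \mathbf{w}. \tag{3}$$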
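
As a sanity check on the stacking convention, the sketch below draws one block of the model and confirms numerically that the per-antenna relation (2) and the Kronecker form (3) produce the same output. It is a minimal illustration under stated assumptions: the sizes, the SNR value, the seed and the helper names (`cn`, `matvec`, `kron`) are hypothetical, and the rows of the placeholder matrix `D` are not normalized to unit norm.

```python
import math
import random


def matvec(A, v):
    """Matrix-vector product for matrices stored as nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def cn(rng):
    """One CN(0, 1) sample: real and imaginary parts each have variance 1/2."""
    return complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))


rng = random.Random(0)
L, Q, R, rho = 3, 2, 2, 10.0  # illustrative sizes and SNR (assumptions)
D = [[cn(rng) for _ in range(Q)] for _ in range(L)]  # placeholder D
x = [cn(rng) for _ in range(L)]
s_blocks = [[cn(rng) for _ in range(Q)] for _ in range(R)]
w_blocks = [[cn(rng) for _ in range(L)] for _ in range(R)]

# X D with X = diag(x); note that diag(h_m) x = X D s_m because h_m = D s_m.
XD = [[x[i] * D[i][j] for j in range(Q)] for i in range(L)]

# Relation (2): one vector equation per receive antenna.
y_per_antenna = [
    [math.sqrt(rho) * v + n for v, n in zip(matvec(XD, s_m), w_m)]
    for s_m, w_m in zip(s_blocks, w_blocks)
]

# Relation (3): the same equations with y, s and w stacked.
I_R = [[1 if i == j else 0 for j in range(R)] for i in range(R)]
s = [v for blk in s_blocks for v in blk]
w = [v for blk in w_blocks for v in blk]
y_stacked = [
    math.sqrt(rho) * v + n for v, n in zip(matvec(kron(I_R, XD), s), w)
]

y_flat = [v for blk in y_per_antenna for v in blk]
max_gap = max(abs(a - b) for a, b in zip(y_flat, y_stacked))
print("largest difference between (2) and (3):", max_gap)
```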

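The capacity of the channel (3) is defined as

$$C(\rho) \triangleq \frac{1}{L} \sup_{f_{\mathbf{x}}(\cdot)} I(\mathbf{x}; \mathbf{y}), \tag{4}$$

where the supremum is taken over all input distributions $f_{\mathbf{x}}(\cdot)$
that satisfy the average-power constraint

$$\mathbb{E}\big[\|\mathbf{x}\|^2\big] \leq L. \tag{5}$$

The capacity pre-log, the central quantity of interest in this paper,
is defined as

$$\chi \triangleq \lim_{\rho \to \infty} \frac{C(\rho)}{\log(\rho)}.$$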

III. Intuitive Analysis 

We start with a simple "back-of-the-envelope" calculation that
allows us to develop some intuition on the main result in this paper,
summarized in (1). The different steps in the intuitive analysis
below will be seen to have rigorous counterparts in the formal
proof of the capacity pre-log lower bound detailed in Section VI.

The capacity pre-log characterizes the channel capacity be- 
havior in the regime where additive noise can "effectively" be 
ignored. To guess the capacity pre-log, it therefore appears 
prudent to consider the problem of identifying the transmit
symbols $x_i$, $i \in [1:L]$, from the noise-free (and rescaled)
observation

$$\mathbf{y} = (I_R \otimes \mathbf{X}D)\,\mathbf{s}. \tag{6}$$

Specifically, we shall ask the question: "How many symbols 
can be identified uniquely from y given that the vector of channel 
coefficients s is unknown but the statistics of the channel, i.e., the 
matrix D, are known?" The claim we make is that the capacity 
pre-log is given by the number of identifiable symbols divided 
by the block length L. 

We start by noting that the unknown variables in (6) are $\mathbf{s}$ and
$\mathbf{x}$, which means that we have a quadratic system of equations. It
turns out, however, that the simple change of variables

$$z_i = 1/x_i, \quad i \in [1:L] \tag{7}$$

(we make the technical assumption $|x_i| > 0$, $i \in [1:L]$, in
the remainder of this section) transforms (6) into a system of
equations that is linear in $\mathbf{s}$ and $z_i$, $i \in [1:L]$. Since the
transformation $z_i = 1/x_i$ is invertible for $x_i \neq 0$, uniqueness of the
solution of the linear system of equations in $\mathbf{s}$ and $z_i$, $i \in [1:L]$,
is equivalent to uniqueness of the solution of the quadratic system
of equations in $\mathbf{s}$ and $x_i$, $i \in [1:L]$.

For concreteness and simplicity of exposition, we first consider
the case $L = 3$ and $R = Q = 2$ and assume that $D$
satisfies the technical condition specified in Theorem 1, stated
in Section IV. A direct computation reveals that upon change
of variables according to (7), the quadratic system (6) can be
rewritten as the following linear system of equations:

$$\begin{bmatrix}
d_{11} & d_{12} & 0 & 0 & y_1 & 0 & 0 \\
d_{21} & d_{22} & 0 & 0 & 0 & y_2 & 0 \\
d_{31} & d_{32} & 0 & 0 & 0 & 0 & y_3 \\
0 & 0 & d_{11} & d_{12} & y_4 & 0 & 0 \\
0 & 0 & d_{21} & d_{22} & 0 & y_5 & 0 \\
0 & 0 & d_{31} & d_{32} & 0 & 0 & y_6
\end{bmatrix}
\begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ -z_1 \\ -z_2 \\ -z_3 \end{bmatrix} = 0. \tag{8}$$

The solution of (8) cannot be unique, as we have 6 equations
in 7 unknowns. The $x_i = 1/z_i$, $i \in [1:3]$, can, therefore, not
be determined uniquely from $\mathbf{y}$. We can, however, make the
solution of (8) unique if we devote one of the data symbols
$x_i$ to transmitting a pilot symbol (known to the receiver). Take,
for concreteness, $x_1 = 1$. Then (8) reduces to the following
inhomogeneous system of 6 equations in 6 unknowns:

$$\underbrace{\begin{bmatrix}
d_{11} & d_{12} & 0 & 0 & 0 & 0 \\
d_{21} & d_{22} & 0 & 0 & y_2 & 0 \\
d_{31} & d_{32} & 0 & 0 & 0 & y_3 \\
0 & 0 & d_{11} & d_{12} & 0 & 0 \\
0 & 0 & d_{21} & d_{22} & y_5 & 0 \\
0 & 0 & d_{31} & d_{32} & 0 & y_6
\end{bmatrix}}_{B}
\begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ -z_2 \\ -z_3 \end{bmatrix}
= \begin{bmatrix} y_1 \\ 0 \\ 0 \\ y_4 \\ 0 \\ 0 \end{bmatrix}. \tag{9}$$

This system of equations has a unique solution if $\det B \neq 0$. We
prove in the appendix that under the technical condition on $D$
specified in Theorem 1, stated in Section IV, we, indeed, have
that $\det B \neq 0$ for almost all $y_2, y_3, y_5, y_6$. It, therefore, follows
that for almost all $\mathbf{y}$, the linear system of equations (9) has a
unique solution. As explained above, this implies uniqueness of
the solution of the original quadratic system of equations (6).
We can therefore recover $z_2$ and $z_3$, and, hence, $x_2 = 1/z_2$ and
$x_3 = 1/z_3$ from $\mathbf{y}$. Summarizing our findings, we expect that the
capacity pre-log of the channel (3), for the special case $L = 3$
and $R = Q = 2$, is equal to $2/3$, which is larger than the capacity
pre-log of the corresponding SISO channel (i.e., one of the SISO
component channels), given by $1 - Q/L = 1/3$ [10]. This
answer, obtained through the back-of-the-envelope calculation
above, coincides with the rigorous result in Theorem 1.
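
The example can be replayed numerically. The sketch below builds a noise-free observation for $L = 3$, $R = Q = 2$ with the pilot $x_1 = 1$, forms the linearized system in the unknowns $(s_1, s_2, s_3, s_4, -z_2, -z_3)$, and recovers $x_2$ and $x_3$. All numerical values, the matrix `D` and the helper `solve` are hypothetical choices made for illustration; the matrix `D` is picked so that any two of its rows are linearly independent, and its rows are not normalized.

```python
def solve(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    u = [0j] * n
    for r in reversed(range(n)):
        acc = M[r][n] - sum(M[r][c] * u[c] for c in range(r + 1, n))
        u[r] = acc / M[r][r]
    return u


# Hypothetical values for illustration only.
D = [[1, 1], [1, 2], [1, 3]]          # any two rows are linearly independent
x_true = [1, 2 - 1j, -0.5 + 0.25j]    # x_1 = 1 is the pilot symbol
s_true = [1 + 1j, 0, 0, 1 - 1j]       # (s_1, s_2) antenna 1, (s_3, s_4) antenna 2

# Noise-free observation: y_m = diag(x) D s_m for m = 1, 2, stacked.
y = []
for m in range(2):
    for i in range(3):
        h = D[i][0] * s_true[2 * m] + D[i][1] * s_true[2 * m + 1]
        y.append(x_true[i] * h)

# Linear system obtained after z_i = 1/x_i and x_1 = 1.
B = [
    [D[0][0], D[0][1], 0, 0, 0, 0],
    [D[1][0], D[1][1], 0, 0, y[1], 0],
    [D[2][0], D[2][1], 0, 0, 0, y[2]],
    [0, 0, D[0][0], D[0][1], 0, 0],
    [0, 0, D[1][0], D[1][1], y[4], 0],
    [0, 0, D[2][0], D[2][1], 0, y[5]],
]
rhs = [y[0], 0, 0, y[3], 0, 0]

u = solve(B, rhs)                      # u = (s_1, s_2, s_3, s_4, -z_2, -z_3)
x_hat = [1, 1 / (-u[4]), 1 / (-u[5])]
print("recovered symbols:", x_hat)
```

Two of the three symbols carry data, which is the count behind the guess of $2/3$ for the pre-log in this special case.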

We next generalize what we learned in the example above
to $L$, $R$, and $Q$ arbitrary, and start by noting that if $(\mathbf{X}, \mathbf{s})$ is a
solution of $\mathbf{y} = (I_R \otimes \mathbf{X}D)\,\mathbf{s}$ for fixed $\mathbf{y}$, then $(\alpha\mathbf{X}, \mathbf{s}/\alpha)$ with
nonzero $\alpha \in \mathbb{C}$ is also a solution of this system of equations. It is therefore
immediately clear that at least one pilot symbol is needed to make
this system of equations uniquely solvable.

2 Except for a set of measure zero.

To guess the capacity pre-log for general parameters $L$, $R$,
and $Q$, we first note that the homogeneous linear system of
equations corresponding to that in (8) has $RL$ equations for
$RQ + L$ unknowns. As the example above indicates, we need
to seek conditions under which this homogeneous linear system
of equations can be converted into a linear system of equations
that has a unique solution. Provided that $D$ satisfies the technical
condition specified in Theorem 1 below, this entails meeting the
following two requirements: (i) at least one symbol is used as a
pilot symbol to resolve the scaling ambiguity described in the
previous paragraph; (ii) the number of unknowns in the system
of equations corresponding to that in (8) must be smaller than
or equal to the number of equations. To maximize the capacity
pre-log we want to use the minimum number of pilot symbols
that guarantees (i) and (ii). In order to identify this minimum,
we have to distinguish two cases:

1) When $RL < RQ + L$ [in this case $\min[1 - 1/L, R(1 - Q/L)]
   = R(1 - Q/L)$], we will need at least $RQ + L - RL$
   pilot symbols to satisfy requirement (ii). Since $RQ + L -
   RL \geq 1$, choosing exactly $RQ + L - RL$ pilot symbols will
   satisfy both requirements. The number of symbols left for
   communication will, therefore, be $L - (RQ + L - RL) =
   R(L - Q)$. Hence, we expect the capacity pre-log to be
   given by $R(1 - Q/L)$, which agrees with the result stated
   in (1).

2) When $RL \geq RQ + L$ [in this case $\min[1 - 1/L, R(1 - Q/L)]
   = 1 - 1/L$], we will need at least one pilot symbol
   to satisfy requirement (i). Since requirement (ii) is satisfied
   as a consequence of $RL \geq RQ + L$, it suffices to choose
   exactly one pilot symbol. The number of symbols left for
   communication will, therefore, be $L - 1$ and we hence
   expect the capacity pre-log to equal $1 - 1/L$, which again
   agrees with the result stated in (1). Note that the resulting
   inhomogeneous linear system of equations has $RL$ equations
   in $RQ + L - 1$ unknowns. As there are more equations
   than unknowns, $RL - RQ - L + 1$ equations are redundant
   and can be eliminated.
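
The two cases above amount to a short counting rule, sketched below under the assumption that requirements (i) and (ii) are the only constraints on the number of pilots. The function names are hypothetical. The sketch compares the count-based guess with the closed form in (1) over a small parameter grid.

```python
from fractions import Fraction


def pilots_needed(L: int, Q: int, R: int) -> int:
    """Smallest pilot count meeting requirements (i) and (ii).

    (i) asks for at least one pilot; (ii) asks that the RQ + L - p
    remaining unknowns do not exceed the RL equations.
    """
    return max(1, R * Q + L - R * L)


def prelog_by_counting(L: int, Q: int, R: int) -> Fraction:
    """Symbols left for communication divided by the block length."""
    return Fraction(L - pilots_needed(L, Q, R), L)


def prelog_formula(L: int, Q: int, R: int) -> Fraction:
    """Closed form min[1 - 1/L, R(1 - Q/L)]."""
    return min(1 - Fraction(1, L), R * (1 - Fraction(Q, L)))


mismatches = [
    (L, Q, R)
    for L in range(1, 13)
    for Q in range(1, L + 1)
    for R in range(1, 7)
    if prelog_by_counting(L, Q, R) != prelog_formula(L, Q, R)
]
print("parameter triples where the two expressions differ:", mismatches)
```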

The proof of our main result, stated in the next section, 
will provide rigorous justification for the casual arguments put 
forward in this section. 

IV. The Capacity Pre-Log 
The main result of this paper is the following theorem. 

Theorem 1. Suppose that $D$ satisfies the following

Property (A): Every $Q$ rows of $D$ are linearly independent.

Then, the capacity pre-log of the SIMO channel (3) is given by

$$\chi = \min[1 - 1/L,\; R(1 - Q/L)]. \tag{10}$$

Remark 1. We will prove Theorem 1 by showing, in Section V,
that the capacity pre-log of the SIMO channel (3) can be
upper-bounded as

$$\chi \leq \min[1 - 1/L,\; R(1 - Q/L)] \tag{11}$$

and by establishing, in Section VI, the lower bound

$$\chi \geq \min[1 - 1/L,\; R(1 - Q/L)]. \tag{12}$$



While the upper bound (11) can be shown to hold even if $D$ does
not satisfy Property (A), this property is crucial to establish the
lower bound (12).

Remark 2. The lower bound (12) continues to hold if
Property (A) is replaced by the following milder condition on $D$.

Property (A'): There exists a subset of indices $\mathcal{K} \subset [1:L]$
with cardinality

$$|\mathcal{K}| = \min\big(\lceil (RQ - 1)/(R - 1) \rceil,\; L\big)$$

such that every $Q$ rows of $D_{\mathcal{K}}$ are linearly independent.

We decided, however, to state our main result under the
stronger Property (A) as both Property (A) and Property (A')
are very mild and the proof of the lower bound (12) under
Property (A') is significantly more cumbersome and does not
contain any new conceptual aspects. A sketch of the proof of the 
stronger result (i.e., under Property (A')) can be found in 

We proceed to discuss the significance of Theorem 1.

A. Eliminating the prediction penalty 

According to (10), the capacity pre-log of the SIMO channel (3)
with $R = 2$ receive antennas is given by $\chi = 1 - 1/L$, provided
that Property (A) holds and $L \geq 2Q - 1$. Comparing to the
capacity pre-log $\chi_{\mathrm{SISO}} = 1 - Q/L$ in the SISO case [10]
(this result also follows from (10) with $R = 1$), we see that,
under a mild condition on the channel covariance matrix $D$,
adding only one receive antenna yields a reduction of the channel
uncertainty-induced pre-log penalty from $Q/L$ to $1/L$. How
significant is this reduction? Recall that $Q$ is the number of
uncertain channel parameters within each given block of length
$L$. Hence, the ratio between the rank of the covariance matrix
and the block-length, $Q/L$, is a measure that can be seen as
quantifying the amount of channel uncertainty relative to the
number of degrees of freedom for communication. It often makes
sense to consider $L \to \infty$ with the amount of channel uncertainty
$Q/L$ held constant. For concreteness, consider $L, Q \to \infty$ with
$L = 2Q - 1$ so that $Q/L \to 1/2$. The capacity pre-log penalty
due to channel uncertainty in the SISO case is then given by $1/2$.
Theorem 1 reveals that, by adding a second receive antenna, this
penalty can be reduced to $1/L$ and, hence, be made to vanish in
the limit $L \to \infty$. Intuitively, even though the SISO channels
between the transmit antenna and the two receive antennas are
statistically independent, the transmit signal induces enough
statistical dependence between the corresponding receive signals
for the second receive antenna to be able to resolve the channel
uncertainty associated with the first receive antenna's channel
and thereby make the overall system appear coherent.

B. Number of receive antennas 

Note that for $Q < L$, we can rewrite (10) as

$$\chi = \min[1 - 1/L,\; R(1 - Q/L)] =
\begin{cases}
1 - 1/L, & \text{if } R \geq \lceil (L - 1)/(L - Q) \rceil \\
R(1 - Q/L), & \text{else.}
\end{cases} \tag{13}$$

As illustrated in Fig. 1, it follows from (13) that for fixed $L$ and $Q$
with $Q < L$ the capacity pre-log of the SIMO channel (3) grows
linearly with $R$ as long as $R$ is smaller than the critical value
$\lceil (L - 1)/(L - Q) \rceil$. Once $R$ reaches this critical value, further
increasing the number of receive antennas does not increase the
capacity pre-log.

Fig. 1. The capacity pre-log of the SIMO channel (3).

3 Note that the results in [10] are stated for general channel covariance
matrix D.
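
The saturation behaviour can be tabulated directly. The following sketch, with hypothetical function names and illustrative values of $L$ and $Q$, computes the critical number of receive antennas and lists the pre-log on either side of it.

```python
from fractions import Fraction
from math import ceil


def prelog(L: int, Q: int, R: int) -> Fraction:
    """Closed form min[1 - 1/L, R(1 - Q/L)]."""
    return min(1 - Fraction(1, L), R * (1 - Fraction(Q, L)))


def critical_antennas(L: int, Q: int) -> int:
    """Critical value ceil((L - 1)/(L - Q)); requires Q < L."""
    if not 1 <= Q < L:
        raise ValueError("need 1 <= Q < L")
    return ceil(Fraction(L - 1, L - Q))


L, Q = 10, 7  # illustrative values (an assumption)
R_crit = critical_antennas(L, Q)
table = [(R, prelog(L, Q, R)) for R in range(1, R_crit + 3)]
for R, chi in table:
    print(f"R = {R}: pre-log = {chi}")
```

For these values the pre-log increases in steps of $1 - Q/L$ until $R$ reaches the critical value and stays at $1 - 1/L$ afterwards.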

C. Property (A) is mild 

Property (A) is not very restrictive and is satisfied by many 
practically relevant channel covariance matrices D. For example, 
removing an arbitrary set of L — Q columns from an L x L 
discrete Fourier transform (DFT) matrix results in a matrix that 
satisfies Property (A) when $L$ is prime [16]. (Weighted) DFT
covariance matrices arise naturally in so-called basis-expansion
models for time-selective channels [10].

Property (A) can furthermore be shown to be satisfied by
"generic" matrices $D$. Specifically, if the entries of $D$ are
chosen randomly and independently from a continuous
distribution [17, Sec. 2-3, Def. (2)] (i.e., a distribution with a
well-defined probability density function (PDF)), then the resulting
matrix $D$ will satisfy Property (A) with probability one. The
proof of this statement follows from a union bound argument
together with the fact that $N$ $N$-dimensional vectors
drawn independently from a continuous distribution are linearly
independent with probability one.
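
Property (A) can be checked by brute force for small matrices. The sketch below is one possible numerical test, not a procedure from the paper: it enumerates all $Q$-row submatrices and compares the magnitude of each determinant with a tolerance. The helper names and the tolerance `tol` are assumptions. It is applied to column subsets of an unnormalized DFT matrix, once for a prime block length and once for a composite one.

```python
import cmath
from itertools import combinations


def det(M):
    """Determinant via Gaussian elimination with partial pivoting."""
    M = [list(row) for row in M]
    n, value = len(M), 1
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) == 0:
            return 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            value = -value
        value *= M[col][col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= f * M[col][c]
    return value


def has_property_a(D, tol=1e-9):
    """True if every Q rows of the L x Q matrix D are linearly independent."""
    Q = len(D[0])
    return all(
        abs(det([D[i] for i in rows])) > tol
        for rows in combinations(range(len(D)), Q)
    )


def dft_columns(L, cols):
    """Keep the listed columns of the L x L DFT matrix (unnormalized)."""
    return [
        [cmath.exp(-2j * cmath.pi * k * n / L) for n in cols]
        for k in range(L)
    ]


prime_case = has_property_a(dft_columns(5, [0, 1, 3]))   # L = 5 is prime
composite_case = has_property_a(dft_columns(4, [0, 2]))  # rows 0 and 2 coincide
print("L = 5, columns {0, 1, 3}:", prime_case)
print("L = 4, columns {0, 2}:", composite_case)
```

The composite example fails because the retained columns make rows 0 and 2 identical, which illustrates why the statement above is restricted to prime $L$.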

V. Proof of the Upper Bound (11)

The proof of (11) consists of two parts. First, in Section V-A,
we prove that $\chi \leq R(1 - Q/L)$. This will be accomplished by
generalizing, to the SIMO case, the approach developed in [10,
Prop. 4] for establishing an upper bound on the SISO capacity
pre-log. Second, in Section V-B, we prove that $\chi \leq 1 - 1/L$
by showing that the capacity of a SIMO channel with $R$ receive
antennas and channel covariance matrix of rank $Q$ can be
upper-bounded by the capacity of a SIMO channel with $RQ$ receive
antennas, the same SNR, and a rank-1 covariance matrix. The
desired result, $\chi \leq 1 - 1/L$, then follows by application of [7,
Eq. (27)], [18, Eq. (7)] as detailed below.



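A. First part: $\chi \leq R(1 - Q/L)$

To simplify notation, we first rewrite (3) as

$$\mathbf{Y} = \sqrt{\rho}\,\operatorname{diag}(\mathbf{x})\,D\,\mathbf{S} + \mathbf{W}, \tag{14}$$

where $\mathbf{Y} = [\mathbf{y}_1 \cdots \mathbf{y}_R]$, $\mathbf{H} = [\mathbf{h}_1 \cdots \mathbf{h}_R]$, $\mathbf{W} = [\mathbf{w}_1 \cdots \mathbf{w}_R]$,
and $\mathbf{S} = [\mathbf{s}_1 \cdots \mathbf{s}_R]$.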

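Recall that $D$ has rank $Q$. Without loss of generality, we
assume, in what follows, that the first $Q$ rows of $D$ are linearly
independent. This can always be ensured by reordering the scalar
IO relations in (2). With $\mathcal{Q} = [1:Q]$ and $\mathcal{L} = [Q + 1:L]$ we can
write

$$\begin{aligned}
I(\mathbf{Y}; \mathbf{x}) &= I(\mathbf{Y}_{\mathcal{Q}}, \mathbf{Y}_{\mathcal{L}}; \mathbf{x}) \\
&\overset{(a)}{=} I(\mathbf{Y}_{\mathcal{Q}}; \mathbf{x}) + I(\mathbf{Y}_{\mathcal{L}}; \mathbf{x} \mid \mathbf{Y}_{\mathcal{Q}}) \\
&\overset{(b)}{=} I(\mathbf{Y}_{\mathcal{Q}}; \mathbf{x}_{\mathcal{Q}}) + I(\mathbf{Y}_{\mathcal{Q}}; \mathbf{x}_{\mathcal{L}} \mid \mathbf{x}_{\mathcal{Q}}) + I(\mathbf{Y}_{\mathcal{L}}; \mathbf{x} \mid \mathbf{Y}_{\mathcal{Q}}) \\
&\overset{(c)}{=} I(\mathbf{Y}_{\mathcal{Q}}; \mathbf{x}_{\mathcal{Q}}) + I(\mathbf{Y}_{\mathcal{L}}; \mathbf{x} \mid \mathbf{Y}_{\mathcal{Q}}),
\end{aligned} \tag{15}$$

where (a) and (b) follow by the chain rule for mutual information
and in (c) we used that $\mathbf{Y}_{\mathcal{Q}}$ and $\mathbf{x}_{\mathcal{L}}$ are independent conditional
on $\mathbf{x}_{\mathcal{Q}}$. Next, we upper-bound each term in (15) separately.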

From [19, Thm. 4.2] we can conclude that the assumption
of the first $Q$ rows of $D$ being linearly independent implies
that the first term on the RHS of (15) grows at most
double-logarithmically with SNR and hence does not contribute to
the capacity pre-log. For the reader's convenience, we repeat
the corresponding brief calculation from [19, Thm. 4.2] in
Appendix A and show that:

$$I(\mathbf{Y}_{\mathcal{Q}}; \mathbf{x}_{\mathcal{Q}}) \leq Q \log\log(\rho) + O(1). \tag{16}$$

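Here and in what follows, $O(1)$ refers to the limit $\rho \to \infty$.
For the second term in (15) we can write

$$\begin{aligned}
I(\mathbf{Y}_{\mathcal{L}}; \mathbf{x} \mid \mathbf{Y}_{\mathcal{Q}})
&= h(\mathbf{Y}_{\mathcal{L}} \mid \mathbf{Y}_{\mathcal{Q}}) - h(\mathbf{Y}_{\mathcal{L}} \mid \mathbf{x}, \mathbf{Y}_{\mathcal{Q}}) \\
&\overset{(a)}{\leq} h(\mathbf{Y}_{\mathcal{L}}) - h(\mathbf{Y}_{\mathcal{L}} \mid \mathbf{x}, \mathbf{Y}_{\mathcal{Q}}, \mathbf{S}) \\
&= h(\mathbf{Y}_{\mathcal{L}}) - h(\mathbf{W}_{\mathcal{L}}) \\
&\overset{(b)}{\leq} \sum_{l=Q+1}^{L} \sum_{r=1}^{R} \big( h(\mathsf{y}_{lr}) - h(\mathsf{w}_{lr}) \big) \\
&\overset{(c)}{\leq} \sum_{l=Q+1}^{L} \sum_{r=1}^{R} \log\big(1 + \rho\, \mathbb{E}[|\mathsf{h}_{lr}|^2]\, \mathbb{E}[|\mathsf{x}_l|^2]\big) \\
&\overset{(d)}{\leq} \sum_{l=Q+1}^{L} \sum_{r=1}^{R} \log\big(1 + L \rho\, \mathbb{E}[|\mathsf{h}_{lr}|^2]\big) \\
&\overset{(e)}{=} R(L - Q) \log(\rho) + O(1),
\end{aligned} \tag{17}$$

where in (a) we used the fact that conditioning reduces entropy;
(b) follows from the chain rule for differential entropy and
the fact that conditioning reduces entropy; (c) follows because
Gaussian random variables are differential-entropy-maximizers
for fixed variance and because $\mathsf{h}_{lr}$ and $\mathsf{x}_l$ are independent; (d)
is a consequence of the power constraint (5); and (e) follows
because $\mathbb{E}[|\mathsf{h}_{lr}|^2] = 1$.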

Combining (15), (16), and (17) yields

$$C(\rho) \leq R(1 - Q/L) \log(\rho) + (Q/L) \log\log(\rho) + O(1). \tag{18}$$

Since $\lim_{\rho \to \infty} \log\log(\rho)/\log(\rho) = 0$, this completes the proof
of the bound $\chi \leq R(1 - Q/L)$.

It follows from (18) that for $Q = L$, the capacity pre-log is
zero and $C(\rho)$ can grow no faster than double-logarithmically
in $\rho$.

Recall that $1 - Q/L$ is the capacity pre-log of the correlated
block-fading SISO channel [10]. As the proof of the upper bound
$\chi \leq R(1 - Q/L)$ reveals, the capacity pre-log of the SIMO
channel (3) cannot be larger than $R$ times the capacity pre-log of
the corresponding SISO channel (i.e., the capacity pre-log of one
of the SISO component channels). The upper bound $R(1 - Q/L)$
may seem crude, but, surprisingly, it matches the lower bound
for $R < \lceil (L - 1)/(L - Q) \rceil$.

B. Second part: $\chi \leq 1 - 1/L$

The proof of $\chi \leq 1 - 1/L$ will be accomplished in two steps.
In the first step, we show that the capacity of a SIMO channel
with $R$ receive antennas and rank-$Q$ channel covariance matrix
is upper-bounded by the capacity of a SIMO channel with $RQ$
receive antennas, the same SNR, and rank-1 covariance matrix.
In the second step, we exploit the fact that the channel (14) with
rank-1 covariance matrix (under the assumption that the rows of
$D$ have unit norm) is a constant block-fading channel for which
the capacity pre-log was shown in [7] to equal $1 - 1/L$. We now
implement the proof program just outlined.

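Let $d_1, \ldots, d_Q \in \mathbb{C}^L$ denote the columns of the $L \times Q$ matrix
$D$ so that $D = [d_1 \cdots d_Q]$. Let $\tilde{\mathbf{s}}_1, \ldots, \tilde{\mathbf{s}}_Q \in \mathbb{C}^R$ denote the
transposed rows of the $Q \times R$ matrix $\mathbf{S}$ so that $\mathbf{S}^T = [\tilde{\mathbf{s}}_1 \cdots \tilde{\mathbf{s}}_Q]$.
We can rewrite the IO relation (14) in the following form that is
more convenient for the ensuing analysis:

$$\mathbf{Y} = \sqrt{\rho} \sum_{q=1}^{Q} \operatorname{diag}(d_q)\,\mathbf{x}\,\tilde{\mathbf{s}}_q^T + \mathbf{W}.$$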
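
The rewriting rests on the identity $\operatorname{diag}(\mathbf{x})\,D\,\mathbf{S} = \sum_{q} \operatorname{diag}(d_q)\,\mathbf{x}\,\tilde{\mathbf{s}}_q^T$, which can be confirmed numerically. The sketch below does so for one small instance; the numerical values and the helper names `matmul` and `diag_times` are hypothetical and serve only as an illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def diag_times(v, M):
    """Return diag(v) M, i.e. scale row i of M by v[i]."""
    return [[v[i] * M[i][j] for j in range(len(M[0]))] for i in range(len(M))]


# Hypothetical small instance with L = 3, Q = 2, R = 2.
L, Q, R = 3, 2, 2
D = [[1, 1j], [2, -1], [0.5, 3]]
S = [[1 + 1j, 2], [-1, 0.5j]]
x = [1, 2 - 1j, -3j]

# Left-hand side: diag(x) D S, the signal part of the matrix IO relation.
lhs = diag_times(x, matmul(D, S))

# Right-hand side: sum over q of diag(d_q) x s_q^T, built entry by entry.
rhs = [[0j] * R for _ in range(L)]
for q in range(Q):
    d_q = [D[l][q] for l in range(L)]  # q-th column of D
    s_q = S[q]                         # q-th row of S
    for l in range(L):
        for r in range(R):
            rhs[l][r] += d_q[l] * x[l] * s_q[r]

max_gap = max(abs(lhs[l][r] - rhs[l][r]) for l in range(L) for r in range(R))
print("largest entrywise difference:", max_gap)
```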

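Let $\mathbf{W}_1, \ldots, \mathbf{W}_Q$ be independent random matrices of dimension
$L \times R$, each with i.i.d. $\mathcal{CN}(0, 1)$ entries. As, by assumption, the
rows of $D$ have unit norm, we have that

$$\mathbf{W} \sim \sum_{q=1}^{Q} \operatorname{diag}(d_q)\,\mathbf{W}_q.$$

Hence, we can rewrite $\mathbf{Y}$ as

$$\mathbf{Y} \sim \sum_{q=1}^{Q} \operatorname{diag}(d_q)\,\mathbf{Y}_q, \tag{19}$$

where

$$\mathbf{Y}_q = \sqrt{\rho}\,\mathbf{x}\,\tilde{\mathbf{s}}_q^T + \mathbf{W}_q. \tag{20}$$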

Note now that each $\mathbf{Y}_q$ is the output of a SIMO channel with $R$
receive antennas, rank-1 channel covariance matrix, and SNR $\rho$.
Realizing that, by (19) and (20), $\mathbf{x} \to \{\mathbf{Y}_1, \ldots, \mathbf{Y}_Q\} \to \mathbf{Y}$
forms a Markov chain, we conclude, by the data-processing
inequality [20, Sec. 2.8], that

$$I(\mathbf{Y}; \mathbf{x}) \leq I(\mathbf{Y}_1, \ldots, \mathbf{Y}_Q; \mathbf{x}).$$

The claim now follows by noting that the $L \times (RQ)$ matrix
obtained by stacking the matrices $\mathbf{Y}_q$ next to each other can be
interpreted as the output of a SIMO channel with $RQ$ receive
antennas, rank-1 covariance matrix, independent fading across
receive antennas, and SNR $\rho$. The proof is completed by
upper-bounding the capacity of this channel by means of the following
lemma.

Lemma 2. The capacity of the SIMO channel with $R$
receive antennas, $Q = 1$, and $L > 2$ can be upper-bounded
according to

$$C(\rho) \leq (1 - 1/L) \log \rho + O(1), \quad \rho \to \infty.$$

This result follows from [7, Eq. (27)]. A simpler and more
detailed proof can be found in [18, Eq. (7)].



VI. Proof of the Lower Bound (12)

To help the reader navigate through the proof of the lower
bound (12), we start by explaining the architecture of the proof.



A. Architecture of the proof 

The proof consists of the following steps, each of which
corresponds to a subsection in this section:

Step 1: Choose an input distribution; we will see that i.i.d.
$\mathcal{CN}(0, 1)$ input symbols allow us to establish the capacity
pre-log lower bound (12).

Step 2: Decompose the mutual information between the input
and the output of the channel according to
$I(\mathbf{x}; \mathbf{y}) = h(\mathbf{y}) - h(\mathbf{y} \mid \mathbf{x})$.

Step 3: Using standard information-theoretic bounds show that
$h(\mathbf{y} \mid \mathbf{x})$ is upper-bounded by $RQ \log(\rho) + O(1)$.

Step 4: Split $h(\mathbf{y})$ into three terms: a term that depends on SNR,
a differential entropy term that depends on the noiseless
channel output $\mathbf{y}$ only, and a differential entropy term
that depends on the noise vector $\mathbf{w}$ only. Conclude that
the last of these three terms is a finite constant.

Step 5: Conclude that the SNR-dependent term obtained in
Step 4 scales (in SNR) as $\min[RQ + L - 1, RL] \log(\rho)$.
Together with the decomposition from Step 2 and the
result from Step 3 this gives the desired lower bound (12)
provided that the $\mathbf{y}$-dependent differential entropy
obtained in Step 4 can be lower-bounded by a finite
constant.

Step 6: To show that the $\mathbf{y}$-dependent differential entropy
obtained in Step 4 can be lower-bounded by a finite
constant, apply the change of variables $\mathbf{y} \to (\mathbf{x}, \mathbf{s})$ to
rewrite the differential entropy as a sum of the
differential entropy of $(\mathbf{x}, \mathbf{s})$ and the expected (w.r.t. $\mathbf{x}$ and $\mathbf{s}$)
logarithm of the Jacobian determinant corresponding to
the transformation $\mathbf{y} \to (\mathbf{x}, \mathbf{s})$. Conclude that the
differential entropy of $(\mathbf{x}, \mathbf{s})$ is a finite constant. It remains
to show that the expected logarithm of the Jacobian
determinant is lower-bounded by a finite constant as
well.

Step 7: Factor out the $\mathbf{x}$-dependent terms from the expected
logarithm of the Jacobian determinant and conclude
that these terms are finite constants. It remains to show
that the expected logarithm of the $\mathbf{s}$-dependent factor
in the Jacobian determinant is lower-bounded by a
finite constant as well. This poses the greatest technical
difficulties in the proof of the lower bound (12) and is
addressed in the remaining steps.

Step 8: Based on a deep result from algebraic geometry, known
as Hironaka's Theorem on the Resolution of Singularities,
conclude that the expected logarithm of the
$\mathbf{s}$-dependent factor in the Jacobian determinant is
lower-bounded by a finite constant, provided that this factor
is nonzero for at least one element in its domain.

Step 9: Prove by explicit construction that there exists at least
one $\mathbf{s}$, for which the $\mathbf{s}$-dependent factor in the Jacobian
determinant is nonzero.

4 Here, and in what follows, whenever we say "finite constant", we mean
SNR-independent and finite.

We next implement the proof program outlined above. 



B. Step 1: Choice of input distribution 

First note that for $Q = L$ the lower bound in (12) reduces
to $\chi \geq 0$ and is hence trivially satisfied. In the remainder of the
paper we shall therefore assume that $Q < L$.

We shall furthermore work under the assumption

$$R \leq \left\lceil \frac{L-1}{L-Q} \right\rceil, \tag{21}$$

which trivially leads to a capacity pre-log lower bound for larger $R$ as well,
as capacity is a nondecreasing function of $R$ (one can always switch off
receive antennas).

A capacity lower bound is trivially obtained by evaluating the 
mutual information in Q for an appropriate input distribution. 
Specifically, we take i.i.d. $x_i \sim \mathcal{CN}(0,1)$, $i \in [1:L]$. This
implies that $h(x_i) > -\infty$, $i \in [1:L]$, and, hence [19, Lem. 6.7],

$$\mathbb{E}[\log(|x_i|)] > -\infty, \quad i \in [1:L]. \tag{22}$$

We point out that every input vector with i.i.d., zero-mean, unit-variance
entries $x_i$ that satisfy $h(x_i) > -\infty$, $i \in [1:L]$, would
allow us to prove (12). The choice $x_i \sim \mathcal{CN}(0,1)$ is made for
concreteness and convenience.
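As a quick sanity check of (22) for the Gaussian choice, the sketch below estimates $\mathbb{E}[\log|x_i|]$ by Monte Carlo. It relies on a standard fact that is not stated in the paper: for $x \sim \mathcal{CN}(0,1)$, $|x|^2$ is exponentially distributed with unit mean, so that $\mathbb{E}[\log|x|] = -\gamma/2$ with $\gamma$ the Euler-Mascheroni constant. The function name, sample size, and seed are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def mean_log_abs_cn(num_samples, seed=0):
    """Monte Carlo estimate of E[log|x|] for x ~ CN(0, 1)."""
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
    total = 0.0
    for _ in range(num_samples):
        re = rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        total += 0.5 * math.log(re * re + im * im)  # log|x| = 0.5 * log|x|^2
    return total / num_samples


estimate = mean_log_abs_cn(200_000)
exact = -EULER_GAMMA / 2  # finite, hence consistent with (22)
print(estimate, exact)
```

The point of the check is only that the expectation is a finite number, which is all that (22) asserts.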

C. Step 2: Mutual information decomposition 
Decompose

$$I(\mathbf{x};\mathbf{y}) = h(\mathbf{y}) - h(\mathbf{y}\,|\,\mathbf{x}) \tag{23}$$

and separately bound the two differential entropy terms for the
input distribution chosen in Step 1.



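D. Step 3: Analysis of $h(\mathbf{y}\,|\,\mathbf{x})$

As $\mathbf{y}$ conditioned on $\mathbf{x}$ is JPG, the conditional differential
entropy $h(\mathbf{y}\,|\,\mathbf{x})$ can be upper-bounded in a straightforward
manner as follows:

$$
\begin{aligned}
h(\mathbf{y}\,|\,\mathbf{x}) &= RL\log(\pi e) + \mathbb{E}_{\mathbf{x}}\!\left[\log\det\!\left(\mathbf{I}_{RL} + \rho\,(\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbb{E}_{\mathbf{s}}[\mathbf{s}\mathbf{s}^H]\,(\mathbf{I}_R \otimes \mathbf{D}^H\mathbf{X}^H)\right)\right] \\
&= RL\log(\pi e) + R\,\mathbb{E}_{\mathbf{x}}\!\left[\log\det\!\left(\mathbf{I}_L + \rho\,\mathbf{X}\mathbf{D}\mathbf{D}^H\mathbf{X}^H\right)\right] \\
&= RL\log(\pi e) + R\,\mathbb{E}_{\mathbf{x}}\!\left[\log\det\!\left(\mathbf{I}_Q + \rho\,\mathbf{D}^H\mathbf{X}^H\mathbf{X}\mathbf{D}\right)\right] \\
&\overset{(a)}{\leq} RL\log(\pi e) + R\log\det\!\left(\mathbf{I}_Q + \rho\,\mathbf{D}^H\,\mathbb{E}_{\mathbf{x}}[\mathbf{X}^H\mathbf{X}]\,\mathbf{D}\right) \\
&= RL\log(\pi e) + R\sum_{i=1}^{Q}\log\!\left(1 + \rho\,\lambda_i(\mathbf{D}^H\mathbf{D})\right) \\
&\overset{(b)}{\leq} RQ\log(\rho) + O(1).
\end{aligned}
\tag{24}
$$

Here, (a) follows from Jensen's inequality, and (b) holds because
$\mathbf{D}$ has rank $Q$ and, therefore, $\lambda_i(\mathbf{D}^H\mathbf{D}) > 0$ for all $i \in [1:Q]$.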
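Step (b) in (24) can be illustrated numerically: when all eigenvalues are strictly positive, $\sum_i \log(1 + \rho\lambda_i) - Q\log\rho$ converges to the SNR-independent constant $\sum_i \log\lambda_i$. The sketch below uses made-up eigenvalues (an assumption for illustration only; they are not taken from the paper).

```python
import math


def sum_log_terms(rho, eigenvalues):
    """Evaluate sum_i log(1 + rho * lambda_i)."""
    return sum(math.log1p(rho * lam) for lam in eigenvalues)


# Hypothetical strictly positive eigenvalues of D^H D (here Q = 3).
eigs = [2.0, 0.5, 0.1]
Q = len(eigs)
limit = sum(math.log(lam) for lam in eigs)  # SNR-independent constant

gaps = []
for rho in (1e2, 1e4, 1e8):
    # Remainder after removing the Q*log(rho) growth; it approaches `limit`.
    gaps.append(sum_log_terms(rho, eigs) - Q * math.log(rho))

print(gaps, limit)
```

If one of the eigenvalues were zero, the corresponding term would not grow with $\rho$ and the pre-factor of $\log\rho$ would drop below $Q$, which is why the rank condition on $\mathbf{D}$ matters in (b).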

E. Step 4: Splitting h(y) into three terms 

Finding an asymptotically (in SNR) tight lower bound on
$h(\mathbf{y})$ is the main technical challenge of the proof of Theorem 1.
The back-of-the-envelope calculation presented in Section III
suggests that the problem can be approached by splitting $h(\mathbf{y})$
into a term that depends on the noiseless channel output
$\hat{\mathbf{y}} = (\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s}$ only and a term that depends on noise $\mathbf{w}$ only. This
can be realized as follows.

Consider a set of indices $\mathcal{I} \subseteq [1:LR]$ (we shall later discuss
how to choose $\mathcal{I}$) and define the following projection matrices

p = (H 

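We can lower-bound $h(\mathbf{y})$ according to

$$
\begin{aligned}
h(\mathbf{y}) &= h(\mathbf{P}\mathbf{y}, \mathbf{Q}\mathbf{y}) \\
&\overset{(a)}{=} h(\mathbf{P}\mathbf{y}) + h(\mathbf{Q}\mathbf{y}\,|\,\mathbf{P}\mathbf{y}) \\
&\overset{(b)}{\geq} h(\sqrt{\rho}\,\mathbf{P}\hat{\mathbf{y}} + \mathbf{P}\mathbf{w}\,|\,\mathbf{P}\mathbf{w}) + h(\sqrt{\rho}\,\mathbf{Q}\hat{\mathbf{y}} + \mathbf{Q}\mathbf{w}\,|\,\mathbf{Q}\hat{\mathbf{y}}, \mathbf{P}\mathbf{y}) \\
&\overset{(c)}{=} h(\sqrt{\rho}\,\mathbf{P}\hat{\mathbf{y}}) + h(\mathbf{Q}\mathbf{w}\,|\,\mathbf{P}\mathbf{y}) \\
&\overset{(d)}{=} h(\sqrt{\rho}\,\mathbf{P}\hat{\mathbf{y}}) + h(\mathbf{Q}\mathbf{w}) \\
&\overset{(e)}{=} |\mathcal{I}|\log(\rho) + h(\mathbf{P}\hat{\mathbf{y}}) + c.
\end{aligned}
\tag{25}
$$

Here, (a) follows by the chain rule for differential entropy;
(b) follows from (6), and because conditioning reduces entropy;
(c) follows because differential entropy is invariant under
translations and because $\mathbf{w}$ and $\hat{\mathbf{y}}$ are independent; (d) follows
because $\mathbf{Q}\mathbf{w}$ and $\mathbf{P}\mathbf{y}$ are independent; and in (e) we used the fact
that $\mathbf{P}\hat{\mathbf{y}}$ is an $|\mathcal{I}|$-dimensional vector and $h(\mathbf{Q}\mathbf{w}) = c$, where $c$
here and in what follows denotes a constant that is independent
of $\rho$ and can take a different value at each appearance.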
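The scaling used in step (e) can be spelled out as follows. This is a short worked derivation under the assumption that $\hat{\mathbf{y}}$ denotes the noiseless channel output and that $\mathbf{P}\hat{\mathbf{y}}$ has a well-defined PDF; it is an instance of Lemma 3 (stated in Step 6) for the linear map $g(\mathbf{u}) = a\mathbf{u}$, whose Jacobian is $a\mathbf{I}_n$.

```latex
% Scaling of differential entropy for a complex random vector u in C^n
% and a deterministic scalar a \neq 0:
h(a\mathbf{u}) = h(\mathbf{u}) + 2\log\lvert\det(a\mathbf{I}_n)\rvert
               = h(\mathbf{u}) + 2n\log\lvert a\rvert .
% With a = \sqrt{\rho}, u = P\hat{y}, and n = |\mathcal{I}|:
h(\sqrt{\rho}\,\mathbf{P}\hat{\mathbf{y}})
  = h(\mathbf{P}\hat{\mathbf{y}}) + 2\lvert\mathcal{I}\rvert\log\sqrt{\rho}
  = h(\mathbf{P}\hat{\mathbf{y}}) + \lvert\mathcal{I}\rvert\log\rho .
```

The factor 2 reflects that each complex dimension contributes two real dimensions, which is why the SNR-dependent term is $|\mathcal{I}|\log\rho$ rather than $\tfrac{1}{2}|\mathcal{I}|\log\rho$.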

Through this chain of inequalities, we disposed of the noise $\mathbf{w}$ and
isolated the SNR dependence into a separate term. This corresponds
to considering the noise-free IO relation (6) in the back-of-the-envelope
calculation. Note further that we also rid ourselves of
the components of $\hat{\mathbf{y}}$ indexed by $[1:LR] \setminus \mathcal{I}$; this corresponds to
eliminating unnecessary equations in the back-of-the-envelope
calculation. The specific choice of the set $\mathcal{I}$ is crucial and will
be discussed next.

F. Step 5: Analysis of the SNR-dependent term in (25)

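If $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$, we can substitute (25) and (24) into (23),
which then yields a capacity lower bound of the form

$$C(\rho) \geq \frac{|\mathcal{I}| - RQ}{L}\log(\rho) + O(1). \tag{26}$$

This bound needs to be tightened by choosing the set $\mathcal{I}$ such that
$|\mathcal{I}|$ is as large as possible while guaranteeing $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$.
Comparing the lower bound (26) to the upper bound (11) we see
that the bounds match if

$$|\mathcal{I}| = \min[RQ + L - 1, RL]. \tag{27}$$

Condition (27) dictates that for $RL \leq RQ + L - 1$ we must set
$\mathcal{I} = [1:RL]$, which yields $\mathbf{P}\hat{\mathbf{y}} = \hat{\mathbf{y}}$. When $RL > RQ + L - 1$
the set $\mathcal{I}$ must be a proper subset of $[1:RL]$. Specifically, we
shall choose $\mathcal{I}$ as follows. Set

$$
\tilde{R} \triangleq
\begin{cases}
R(L-Q) - (L-1), & \text{if } RL > RQ + L - 1 \\
0, & \text{if } RL \leq RQ + L - 1,
\end{cases}
$$

let

$$
\mathcal{I}_r \triangleq
\begin{cases}
[(r-1)L + 1 : rL - 1], & 1 \leq r \leq \tilde{R} \\
[(r-1)L + 1 : rL], & \tilde{R} + 1 \leq r \leq R,
\end{cases}
$$

and define $\mathcal{I} \triangleq \bigcup_{r=1}^{R} \mathcal{I}_r$.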

This choice can be verified to satisfy ( |27] i. Obviously, this is 
not the only choice for X that satisfies ( |27[ ). The specific set I 
chosen here will be seen to guarantee h{Py) > — oo and at the 



same time simplify the calculations in Section [yi-T 



Substituting (27) into (26), we obtain the desired result (12),
provided that $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$. Establishing that $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$ is,
as already mentioned, the major technical difficulty in the proof
of Theorem 1 and will be addressed next.
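The construction of the index set in this step can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: the names `index_set`, `prelog_lower_bound`, and `r_tilde` (the number of blocks that lose their last entry) are assumptions introduced here, and the parameter triples are chosen only for illustration.

```python
from fractions import Fraction


def index_set(R, Q, L):
    """Return the 1-based indices kept by the projection, block by block."""
    if R * L > R * Q + L - 1:
        r_tilde = R * (L - Q) - (L - 1)  # blocks that drop their last entry
    else:
        r_tilde = 0
    if r_tilde > R:
        raise ValueError("construction needs at most one dropped entry per block")
    indices = []
    for r in range(1, R + 1):
        last = r * L - 1 if r <= r_tilde else r * L
        indices.extend(range((r - 1) * L + 1, last + 1))
    return indices


def prelog_lower_bound(R, Q, L):
    """Pre-log factor (|I| - RQ)/L appearing in the bound (26)."""
    return Fraction(len(index_set(R, Q, L)) - R * Q, L)


for R, Q, L in [(2, 2, 3), (2, 3, 4), (2, 2, 4)]:
    kept = index_set(R, Q, L)
    # The size agrees with condition (27).
    assert len(kept) == min(R * Q + L - 1, R * L)
    print((R, Q, L), kept, prelog_lower_bound(R, Q, L))
```

For $R = 2$, $Q = 2$, $L = 4$ one entry (index 4) is dropped, so $|\mathcal{I}| = 7$ and the resulting pre-log factor is $3/4 = 1 - 1/L$.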

G. Step 6: Analysis of $h(\mathbf{P}\hat{\mathbf{y}})$ through change of variables

It is difficult to analyze $h(\mathbf{P}\hat{\mathbf{y}})$ directly since
$\hat{\mathbf{y}} = (\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s}$ depends on the pair of variables $(\mathbf{s}, \mathbf{x})$ in a nonlinear
fashion. We have seen, in Section III, that (6) has a unique
solution in $(\mathbf{s}, \mathbf{x})$, provided that the appropriate number of pilot
symbols is used. This suggests that there must be a one-to-one
correspondence between $\mathbf{P}\hat{\mathbf{y}}$ and the pair $(\mathbf{s}, \mathbf{x})$. The existence
of such a one-to-one correspondence allows us to locally linearize
the equation $\hat{\mathbf{y}} = (\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s}$ and to relate $h(\mathbf{P}\hat{\mathbf{y}})$ to
$h(\mathbf{s}, \mathbf{x}) = h(\mathbf{s}) + h(\mathbf{x})$. This idea is key to bringing $h(\mathbf{P}\hat{\mathbf{y}})$ into
a form that eventually allows us to conclude that $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$.

Formally, it is possible to relate the differential entropies of
two random vectors of the same dimension that are related by
a deterministic one-to-one function (in the sense of [21, p. 7])
according to the following lemma.

Lemma 3 (Transformation of differential entropy). Assume that
$\mathbf{g} : \mathbb{C}^N \to \mathbb{C}^N$ is a continuous vector-valued function that is
one-to-one and differentiable almost everywhere (a.e.) on $\mathbb{C}^N$.
Let $\mathbf{u} \in \mathbb{C}^N$ be a continuous [17, Sec. 2-3, Def. (2)] random
vector (i.e., it has a well-defined PDF) and let $\mathbf{v} = \mathbf{g}(\mathbf{u})$. Then

$$h(\mathbf{v}) = h(\mathbf{u}) + 2\,\mathbb{E}_{\mathbf{u}}\!\left[\log\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|\right],$$

where $\partial\mathbf{g}/\partial\mathbf{u}$ is the Jacobian of the function $\mathbf{g}(\cdot)$.

The proof follows from the change-of-variables theorem for
integrals [21, Thm. 7.26] and is given in Appendix B for completeness
since the version of the theorem for complex-valued
functions does not seem to be well documented in the literature.
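A scalar special case makes the factor 2 in Lemma 3 concrete. The sketch below assumes the standard closed form $h = \log(\pi e \sigma^2)$ for a $\mathcal{CN}(0,\sigma^2)$ random variable (a textbook fact, not restated in the paper) and checks the lemma for the linear map $g(u) = a u$, whose Jacobian is simply $a$. The value of `a` is an arbitrary illustrative choice.

```python
import math


def entropy_cn(variance):
    """Differential entropy (in nats) of a CN(0, variance) random variable."""
    return math.log(math.pi * math.e * variance)


a = 2 - 1j                        # hypothetical nonzero complex scale factor
h_u = entropy_cn(1.0)             # u ~ CN(0, 1)
h_v = entropy_cn(abs(a) ** 2)     # v = a*u ~ CN(0, |a|^2)
rhs = h_u + 2 * math.log(abs(a))  # right-hand side of Lemma 3 with dg/du = a

print(h_v, rhs)
```

Both sides agree because scaling a complex variable by $a$ scales its variance by $|a|^2$, which is the same as stretching two real dimensions by $|a|$ each.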



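Note that $\mathbf{P}\hat{\mathbf{y}} \in \mathbb{C}^{|\mathcal{I}|}$ with $|\mathcal{I}|$ given in (27) and $[\mathbf{s}^T\ \mathbf{x}^T]^T \in \mathbb{C}^{RQ+L}$.
Since $|\mathcal{I}| < RQ + L$ (see (27)), the vectors $\mathbf{P}\hat{\mathbf{y}}$ and
$[\mathbf{s}^T\ \mathbf{x}^T]^T$ are of different dimensions and Lemma 3 can therefore
not be applied directly to relate $h(\mathbf{P}\hat{\mathbf{y}})$ to $h(\mathbf{s}, \mathbf{x})$. This problem
can be resolved by conditioning on a subset $\mathcal{P} \subset [1:L]$ (specified
below) of components of $\mathbf{x}$ according to

$$h(\mathbf{P}\hat{\mathbf{y}}) \geq h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathbf{x}_{\mathcal{P}}). \tag{29}$$

The components $\mathbf{x}_{\mathcal{P}}$ correspond to the pilot symbols in the
back-of-the-envelope calculation. The set $\mathcal{P}$ is chosen such that
(i) the set of remaining components in $\mathbf{x}$, $\mathcal{J} \triangleq [1:L] \setminus \mathcal{P}$,
is of appropriate size ensuring that $\mathbf{P}\hat{\mathbf{y}}$ and $[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ are of
the same dimension, and (ii) $\mathbf{P}\hat{\mathbf{y}}$ and $[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ are related by
a deterministic bijection so that Lemma 3 can be applied to
relate $h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathbf{x}_{\mathcal{P}})$ to $h(\mathbf{s}, \mathbf{x}_{\mathcal{J}}\,|\,\mathbf{x}_{\mathcal{P}})$. Specifically, set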



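$$\alpha \triangleq \max[1, RQ + L - RL], \tag{30}$$

and let $\mathcal{P} \triangleq [1:\alpha]$, which implies $\mathcal{J} = [\alpha + 1 : L]$. Observe
that $\mathbf{P}\hat{\mathbf{y}}$ (conditioned on $\mathbf{x}_{\mathcal{P}}$) depends only on $[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$, and
due to our choice of $\mathcal{J}$ (it is actually the choice of $|\mathcal{J}|$ that is
important here), the vectors $\mathbf{P}\hat{\mathbf{y}}$ and $[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ are of the same
dimension. Furthermore, these two vectors are related through
a deterministic bijection: Consider the vector-valued function

$$
\mathbf{g}_{\mathbf{x}_{\mathcal{P}}} : \mathbb{C}^{|\mathcal{I}|} \to \mathbb{C}^{|\mathcal{I}|}, \qquad
\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = \mathbf{P}(\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s}.
\tag{31}
$$

Here, and whenever we refer to the function $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ in the
following, we use the convention that the parameter vector
$\mathbf{x}_{\mathcal{P}} \in \mathbb{C}^{|\mathcal{P}|}$ and the variable vector $\mathbf{x}_{\mathcal{J}} \in \mathbb{C}^{|\mathcal{J}|}$ are stacked into
the vector $\mathbf{x} = [\mathbf{x}_{\mathcal{P}}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ and we set $\mathbf{X} = \mathrm{diag}(\mathbf{x})$.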
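The dimension count behind the choice (30) is easy to verify programmatically: the number of unknowns, $RQ + |\mathcal{J}|$, must equal the number of retained equations, $\min[RQ + L - 1, RL]$ from (27). The helper name `pilot_split` below is a hypothetical name introduced for this sketch.

```python
def pilot_split(R, Q, L):
    """Return (alpha, pilot indices, remaining data indices), all 1-based."""
    alpha = max(1, R * Q + L - R * L)       # number of pilot symbols, as in (30)
    pilots = list(range(1, alpha + 1))      # the pilot set [1 : alpha]
    data = list(range(alpha + 1, L + 1))    # the remaining set [alpha + 1 : L]
    return alpha, pilots, data


for R, Q, L in [(2, 2, 3), (2, 3, 4), (2, 2, 4), (3, 5, 7)]:
    alpha, pilots, data = pilot_split(R, Q, L)
    unknowns = R * Q + len(data)                 # entries of s plus free inputs
    equations = min(R * Q + L - 1, R * L)        # size of the index set, (27)
    assert unknowns == equations
    print((R, Q, L), alpha, pilots, data)
```

For $L = 3$ and $R = Q = 2$ this gives $\alpha = 1$, a single pilot, and six unknowns matched by six equations, in line with the simple example of Section III.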

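Lemma 4. If $\mathbf{x}_{\mathcal{P}}$ has nonzero components only, i.e., $x_i \neq 0$ for
all $i \in \mathcal{P}$, then the function $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ is one-to-one a.e. on $\mathbb{C}^{|\mathcal{I}|}$.

The proof of Lemma 4 is given in Appendix C and is based
on the results obtained later in this section. We therefore invite
the reader to first study the remainder of Section VI and to return
to Appendix C afterwards.

Recall that $\mathbf{P}\hat{\mathbf{y}} = \mathbf{P}(\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s}$ and hence
$\mathbf{P}\hat{\mathbf{y}} = \mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$. Therefore, it follows from Lemma 4 that as long as
the realization of $\mathbf{x}_{\mathcal{P}}$ is fixed and satisfies $x_i \neq 0$ for all $i \in \mathcal{P}$, $\mathbf{P}\hat{\mathbf{y}}$ and
$[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ are related through the bijection $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ as claimed.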

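Comments: A few comments on Lemma 4 are in order. For
$L = 3$ and $R = Q = 2$ as in the simple example in Section III,
we see from (27) that $\mathcal{I} = [1:RL]$ so that $\mathbf{P} = \mathbf{I}_{RL}$ and $\mathbf{P}\hat{\mathbf{y}} = \hat{\mathbf{y}}$.
Further, for this example, it follows from (30) that $\alpha = 1$ and
hence $\mathcal{P} = \{1\}$ and $\mathcal{J} = \{2, 3\}$. Therefore, Lemma 4 simply
says that (6) has a unique solution for fixed $x_1 \neq 0$. As already
mentioned, conditioning w.r.t. $\mathbf{x}_{\mathcal{P}} = x_1$ in (29) in order to make
the relation between $\mathbf{P}\hat{\mathbf{y}}$ and $[\mathbf{s}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$ one-to-one corresponds
to transmitting a pilot symbol, as was done in the back-of-the-envelope
calculation by setting $x_1 = 1$.

We can now use Lemma 3 to relate $h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathbf{x}_{\mathcal{P}})$ to $h(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ as
follows. Let $f_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ denote the PDF of $\mathbf{x}_{\mathcal{P}}$. Then, we can write

$$h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathbf{x}_{\mathcal{P}}) = \int f_{\mathbf{x}_{\mathcal{P}}}(\mathbf{x}_{\mathcal{P}})\, h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}})\, d\mathbf{x}_{\mathcal{P}}, \tag{32}$$

where, in the conditioning event, the sans-serif $\mathsf{x}_{\mathcal{P}}$ denotes the random vector and $\mathbf{x}_{\mathcal{P}}$ its realization. Let

$$\mathbf{J}(\mathbf{s}, \mathbf{x}) \triangleq \frac{\partial\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}}{\partial(\mathbf{s}, \mathbf{x}_{\mathcal{J}})} \tag{33}$$

be the Jacobian of the mapping in (31) (where we again
use the convention $\mathbf{x} = [\mathbf{x}_{\mathcal{P}}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$). Applying Lemma 3 to
$h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}})$, we get for all $\mathbf{x}_{\mathcal{P}}$ with $x_i \neq 0$, $i \in \mathcal{P}$, that

$$
h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}}) = h(\mathbf{s}, \mathbf{x}_{\mathcal{J}}\,|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}})
+ 2\,\mathbb{E}_{\mathbf{s},\mathbf{x}_{\mathcal{J}}}\!\left[\log(|\det\mathbf{J}(\mathbf{s}, \mathbf{x})|)\,\middle|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}}\right].
\tag{34}
$$

Substituting (34) into (32), we finally obtain

$$
\begin{aligned}
h(\mathbf{P}\hat{\mathbf{y}}\,|\,\mathbf{x}_{\mathcal{P}})
&\overset{(a)}{=} \int f_{\mathbf{x}_{\mathcal{P}}}(\mathbf{x}_{\mathcal{P}})\, h(\mathbf{s}, \mathbf{x}_{\mathcal{J}}\,|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}})\, d\mathbf{x}_{\mathcal{P}} \\
&\quad + 2\int f_{\mathbf{x}_{\mathcal{P}}}(\mathbf{x}_{\mathcal{P}})\, \mathbb{E}_{\mathbf{s},\mathbf{x}_{\mathcal{J}}}\!\left[\log(|\det\mathbf{J}(\mathbf{s}, \mathbf{x})|)\,\middle|\,\mathsf{x}_{\mathcal{P}} = \mathbf{x}_{\mathcal{P}}\right] d\mathbf{x}_{\mathcal{P}} \\
&= h(\mathbf{s}, \mathbf{x}_{\mathcal{J}}\,|\,\mathbf{x}_{\mathcal{P}}) + 2\,\mathbb{E}_{\mathbf{s},\mathbf{x}}\!\left[\log(|\det(\mathbf{J}(\mathbf{s}, \mathbf{x}))|)\right].
\end{aligned}
\tag{35}
$$

Here, in (a), to be able to use (34), we exclude the set $\{\mathbf{x}_{\mathcal{P}} : x_i = 0$
for at least one $i \in \mathcal{P}\}$ from the domain of integration. This
is legitimate since that set has measure zero.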

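The first term on the right-hand side (RHS) of (35) satisfies

$$
\begin{aligned}
h(\mathbf{s}, \mathbf{x}_{\mathcal{J}}\,|\,\mathbf{x}_{\mathcal{P}}) &\overset{(a)}{=} h(\mathbf{s}\,|\,\mathbf{x}_{\mathcal{P}}) + h(\mathbf{x}_{\mathcal{J}}\,|\,\mathbf{s}, \mathbf{x}_{\mathcal{P}}) \\
&\overset{(b)}{=} h(\mathbf{s}) + h(\mathbf{x}_{\mathcal{J}}) \overset{(c)}{=} c,
\end{aligned}
\tag{36}
$$

where (a) follows by the chain rule for differential entropy; in
(b) we used that $\mathbf{x}$ is independent of $\mathbf{s}$, and $\mathbf{x}_{\mathcal{P}}$ is independent
of $\mathbf{x}_{\mathcal{J}}$ because the $x_i$, $i \in [1:L]$, are i.i.d. and $\mathcal{J} \cap \mathcal{P} = \emptyset$; and
(c) follows because the $x_i$, $i \in \mathcal{J}$, and the $s_i$, $i \in [1:RQ]$, are
i.i.d. and have finite differential entropy, by assumption.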



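Combining (36), (35), and (29), we obtain

$$h(\mathbf{P}\hat{\mathbf{y}}) \geq c + 2\,\mathbb{E}_{\mathbf{s},\mathbf{x}}\!\left[\log(|\det(\mathbf{J}(\mathbf{s}, \mathbf{x}))|)\right].$$

To show that $h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$, it therefore remains to prove that

$$\mathbb{E}_{\mathbf{s},\mathbf{x}}\!\left[\log(|\det(\mathbf{J}(\mathbf{s}, \mathbf{x}))|)\right] > -\infty. \tag{37}$$

This requires an in-depth analysis of the structure of $|\det(\mathbf{J}(\cdot))|$,
which will be carried out in the next section.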

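H. Step 7: Factorization of $\det(\mathbf{J}(\cdot))$ and analysis of $\mathbf{x}$-dependent terms

The following lemma shows that the determinant of the Jacobian
in (33) can be factorized into a product of simpler terms.

Lemma 5. The determinant of the Jacobian in (33) factorizes
as

$$\det(\mathbf{J}(\mathbf{s}, \mathbf{x})) = \det(\mathbf{J}_1(\mathbf{x}))\,\det(\mathbf{J}_2(\mathbf{s}))\,\det(\mathbf{J}_3(\mathbf{x}_{\mathcal{J}})),$$

where

$$
\begin{aligned}
\mathbf{J}_1(\mathbf{x}) &\triangleq \mathbf{P}(\mathbf{I}_R \otimes \mathbf{X})\mathbf{P}^T \\
\mathbf{J}_2(\mathbf{s}) &\triangleq \mathbf{P}\left[\mathbf{I}_R \otimes \mathbf{D}\ |\ \mathbf{a}_{\alpha+1}\ |\ \cdots\ |\ \mathbf{a}_L\right] \\
\mathbf{J}_3(\mathbf{x}_{\mathcal{J}}) &\triangleq \mathrm{diag}\!\left(\mathbf{I}_{RQ}, (\mathrm{diag}(\mathbf{x}_{\mathcal{J}}))^{-1}\right)
\end{aligned}
\tag{38}
$$

with

$$\mathbf{a}_i \triangleq (\mathbf{I}_R \otimes \mathrm{diag}(\mathbf{e}_i)\mathbf{D})\,\mathbf{s}, \quad i \in [1:L]. \tag{39}$$

Proof: First note that $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ in (31) can be written as

$$\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = \mathbf{P}\sum_{j \in [1:L]} x_j\,(\mathbf{I}_R \otimes \mathrm{diag}(\mathbf{e}_j)\mathbf{D})\,\mathbf{s}$$

and, therefore,

$$\frac{\partial\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}}{\partial x_i} = \mathbf{P}\,\frac{\partial}{\partial x_i}\left(\sum_{j \in [1:L]} x_j\,(\mathbf{I}_R \otimes \mathrm{diag}(\mathbf{e}_j)\mathbf{D})\,\mathbf{s}\right) = \mathbf{P}\mathbf{a}_i, \quad i \in \mathcal{J}.$$

With

$$\frac{\partial\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}}{\partial\mathbf{s}} = \mathbf{P}(\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})$$

we can now rewrite the Jacobian in (33) as

$$
\begin{aligned}
\mathbf{J}(\mathbf{s}, \mathbf{x}) &= \mathbf{P}\left[\mathbf{I}_R \otimes \mathbf{X}\mathbf{D}\ |\ \mathbf{a}_{\alpha+1}\ |\ \cdots\ |\ \mathbf{a}_L\right] \\
&= \left(\mathbf{P}(\mathbf{I}_R \otimes \mathbf{X})\mathbf{P}^T\right)\mathbf{J}_2(\mathbf{s})\,\mathrm{diag}\!\left(\mathbf{I}_{RQ}, (\mathrm{diag}(\mathbf{x}_{\mathcal{J}}))^{-1}\right),
\end{aligned}
\tag{40}
$$

which concludes the proof. ■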
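The factorization in Lemma 5 can be checked numerically on the small example $L = 3$, $R = Q = 2$, for which $\alpha = 1$ and the projection is the identity (as noted in the comments on Lemma 4). The sketch below is an illustration under these assumptions: the matrix $\mathbf{D}$, the input $\mathbf{x}$, and the vector $\mathbf{s}$ are drawn at random, and the helper names `det`, `crand`, and `build` are introduced here for the purpose of the example.

```python
import random

R, Q, L, ALPHA = 2, 2, 3, 1  # small example; the projection P is the identity
rng = random.Random(1)


def crand():
    """Draw a complex Gaussian sample (illustrative test data only)."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


D = [[crand() for _ in range(Q)] for _ in range(L)]
x = [crand() for _ in range(L)]
s = [crand() for _ in range(R * Q)]


def build(scale):
    """Assemble [I_R (x) diag(scale) D | a_{alpha+1} | ... | a_L]."""
    rows = []
    for r in range(R):
        for l in range(L):
            row = [scale[l] * D[l][q] if rr == r else 0j
                   for rr in range(R) for q in range(Q)]
            d_dot_s = sum(D[l][q] * s[r * Q + q] for q in range(Q))
            row += [d_dot_s if i == l else 0j for i in range(ALPHA, L)]
            rows.append(row)
    return rows


det_J = det(build(x))            # Jacobian J(s, x) as in (40)
det_J2 = det(build([1] * L))     # J_2(s) as in (38)
det_J1 = 1
for xl in x:
    det_J1 *= xl ** R            # det of I_R (x) X
det_J3 = 1
for xl in x[ALPHA:]:
    det_J3 /= xl                 # det of diag(I_RQ, diag(x_J)^-1)

print(det_J, det_J1 * det_J2 * det_J3)
```

The two printed values agree up to floating-point error, reflecting that each row of the Jacobian carries one factor $x_l$ while each of the last $L - \alpha$ columns carries a compensating factor $1/x_i$.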
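Using Lemma 5 we can rewrite the second term on the RHS
of (35) according to

$$
\begin{aligned}
\mathbb{E}_{\mathbf{s},\mathbf{x}}\!\left[\log(|\det(\mathbf{J}(\mathbf{s}, \mathbf{x}))|)\right] &= \mathbb{E}\!\left[\log(|\det(\mathbf{J}_1(\mathbf{x}))|)\right] \\
&\quad + \mathbb{E}\!\left[\log(|\det(\mathbf{J}_2(\mathbf{s}))|)\right] \\
&\quad + \mathbb{E}\!\left[\log(|\det(\mathbf{J}_3(\mathbf{x}_{\mathcal{J}}))|)\right].
\end{aligned}
\tag{41}
$$

The first and the third term in (41) can be expanded as

$$\mathbb{E}\!\left[\log(|\det(\mathbf{J}_1(\mathbf{x}))|)\right] = R\sum_{j=1}^{L-1}\mathbb{E}[\log(|x_j|)] + (R - \tilde{R})\,\mathbb{E}[\log(|x_L|)] \tag{42}$$

$$\mathbb{E}\!\left[\log(|\det(\mathbf{J}_3(\mathbf{x}_{\mathcal{J}}))|)\right] = -\sum_{j \in \mathcal{J}}\mathbb{E}[\log(|x_j|)], \tag{43}$$

where $\tilde{R}$ denotes the number of blocks $r$ for which the index $rL$ is excluded from $\mathcal{I}$ in Step 5.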

Using p2) , Q, and Jensen's inequality, we have 

$$-\infty < \mathbb{E}[\log(|x_j|)] \leq \log(\mathbb{E}[|x_j|]) < \infty,$$

which immediately implies that the terms on the left-hand side
(LHS) of (42) and (43) are finite. It remains to show that

$$\mathbb{E}\!\left[\log(|\det(\mathbf{J}_2(\mathbf{s}))|)\right] > -\infty.$$

I. Step 8: Proving $\mathbb{E}[\log(|\det(\mathbf{J}_2(\mathbf{s}))|)] > -\infty$ through resolution of singularities

This is the most technical part of the proof of Theorem 1. We
need to show that

$$\mathbb{E}\!\left[\log(|\det(\mathbf{J}_2(\mathbf{s}))|)\right] = \frac{1}{\pi^{RQ}}\int_{\mathbb{C}^{RQ}}\exp(-\|\mathbf{s}\|^2)\,\log(|\det(\mathbf{J}_2(\mathbf{s}))|)\,d\mathbf{s} > -\infty. \tag{44}$$



Since $\mathbf{J}_2(\cdot)$ is a large matrix with little structure to exploit, a
direct evaluation of the integral in (44) seems daunting. Note,
however, that by (38), (39), and [22, 4.2.1(2)] it follows that
$\det(\mathbf{J}_2(\mathbf{s}))$ is a homogeneous polynomial in $s_1, \ldots, s_{RQ}$; in
other words, $\det(\mathbf{J}_2(\cdot))$ is a well-behaved function of its arguments.
It turns out that this mild property is sufficient to prove
the inequality in (44). The proof, however, requires powerful
tools, which will be described next.

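Lemma 6. Let $p(\mathbf{u})$, $\mathbf{u} \in \mathbb{C}^N$, be a homogeneous polynomial
in $u_1, \ldots, u_N$. Then, $p(\cdot) \not\equiv 0$ implies that

$$\int_{\mathbb{C}^N}\exp(-\|\mathbf{u}\|^2)\,\log(|p(\mathbf{u})|)\,d\mathbf{u} > -\infty.$$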

Lemma 6 is proved in Appendix D using the following general
result, which is a consequence of Hironaka's Theorem on the


Resolution of Singularities 1 1 1 Theorem 2.3]. 

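Theorem 7. Let $f(\cdot) \not\equiv 0$ be a real analytic function$^5$ [14, Def.
2.2.1] on an open set $\Omega \subset \mathbb{R}^K$. Then

$$\int_{\Delta}\left|\log(|f(\mathbf{u})|)\right|d\mathbf{u} < \infty \tag{45}$$

for all compact sets $\Delta \subset \Omega$.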

For a formal proof of Theorem 7 see Appendix E. Here, we
explain intuitively why this result holds. The only reason why
the integral in (45) could diverge is that $|f(\cdot)|$ may take on
the value zero and $\log(0) = -\infty$. Since $f(\cdot)$ is a real analytic
function and since $f(\cdot) \not\equiv 0$, the zero set $f^{-1}(\{0\})$ has measure
zero. To prove (45), it remains to examine the detailed behavior of
$f(\cdot)$ around the zero set $f^{-1}(\{0\})$. The integral of $|\log(|f(\cdot)|)|$
over a small enough neighborhood around each smooth (i.e.,
nonsingular) point in the zero set is bounded, but it is difficult
to determine what happens near the singularities. Hironaka's
Theorem on the Resolution of Singularities "untangles" the
singularities so that we can understand their structure. More
formally, Hironaka's Theorem states that in a small neighborhood
around every point in $f^{-1}(\{0\})$, the real analytic function
$f(\cdot)$ behaves like a product of a monomial of finite degree
and a nonvanishing real analytic function. The integral of the
logarithm of the absolute value of this product over a small
enough neighborhood around each point in $f^{-1}(\{0\})$ is then
easily bounded and turns out to be finite. The union of the
neighborhoods of the points in $f^{-1}(\{0\})$ forms an open cover
for $f^{-1}(\{0\})$. Since $\Delta$ is a compact set, it is possible to find a
finite subcover for $f^{-1}(\{0\}) \cap \Delta$. Summing up the integrals over the
elements of this subcover, each of which is finite as explained
above, allows us to deduce that the integral in (45) must be finite
as well.

5 Let $\Omega$ be an open subset of $\mathbb{R}^K$. A function $f(\cdot) : \Omega \to \mathbb{R}$ is real analytic
if for every $\mathbf{x}_0 \in \Omega$, $f(\cdot)$ can be represented by a convergent power series in
some neighborhood of $\mathbf{x}_0$.
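The simplest instance of Theorem 7 is $f(u) = u$ on the compact set $[0, 1] \subset \mathbb{R}$: the integrand $|\log|u||$ blows up at the zero $u = 0$, yet its integral equals 1. The sketch below approximates this integral with a midpoint rule (the rule and the number of cells are illustrative choices, not part of the paper's argument).

```python
import math


def midpoint_abs_log(num_cells):
    """Midpoint-rule approximation of the integral of |log t| over [0, 1]."""
    h = 1.0 / num_cells
    # Midpoints avoid t = 0, where the integrand is unbounded.
    return sum(abs(math.log((k + 0.5) * h)) for k in range(num_cells)) * h


approx = midpoint_abs_log(100_000)
print(approx)  # close to the exact value 1
```

The logarithmic singularity is integrable because $|\log t|$ grows more slowly than any negative power of $t$; this is the one-dimensional prototype of the monomial-times-nonvanishing-function structure that Hironaka's Theorem provides near general zeros.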

On account of Lemma 6, to show (44) it suffices to verify that
$\det(\mathbf{J}_2(\cdot)) \not\equiv 0$. This is indeed the case as demonstrated next.



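J. Step 9: Identifying an $\mathbf{s}$ for which $\det(\mathbf{J}_2(\mathbf{s})) \neq 0$

Lemma 8. Property (A) in Theorem 1 implies that
$\det(\mathbf{J}_2(\cdot)) \not\equiv 0$.

Proof: The proof is effected by showing that Property (A)
implies the existence of a vector $\mathbf{s} \in \mathbb{C}^{RQ}$ such that $\det(\mathbf{J}_2(\mathbf{s})) \neq 0$.
To this end, we first note that $\mathbf{J}_2(\mathbf{s})$ in (38) can be written as

$$
\mathbf{J}_2(\mathbf{s}) = \left[\mathbf{P}(\mathbf{I}_R \otimes \mathbf{D})\ \ \mathbf{A}\right] \quad\text{with}\quad
\mathbf{A} \triangleq \begin{bmatrix}\mathbf{A}_1 \\ \vdots \\ \mathbf{A}_R\end{bmatrix}
\tag{46}
$$

and

$$
\mathbf{A}_i \triangleq
\begin{cases}
\begin{pmatrix}
\mathbf{0}_\alpha & \cdots & \mathbf{0}_\alpha & \mathbf{0}_\alpha \\
\mathbf{d}_{\alpha+1}^T\mathbf{s}_i & & & 0 \\
& \ddots & & \vdots \\
& & \mathbf{d}_{L-1}^T\mathbf{s}_i & 0
\end{pmatrix}, & i \in [1:\tilde{R}], \\[2ex]
\begin{pmatrix}
\mathbf{0}_\alpha & \cdots & \mathbf{0}_\alpha \\
\mathbf{d}_{\alpha+1}^T\mathbf{s}_i & & \\
& \ddots & \\
& & \mathbf{d}_{L}^T\mathbf{s}_i
\end{pmatrix}, & i \in [\tilde{R}+1:R].
\end{cases}
$$

Here, $\alpha$ was defined in (30); $\mathbf{0}_\alpha$ denotes an all-zero vector of
dimension $\alpha$; $\tilde{R}$ denotes the number of blocks $r$ for which the index $rL$ is excluded from $\mathcal{I}$ in Step 5; $\mathbf{d}_1, \ldots, \mathbf{d}_L \in \mathbb{C}^Q$ are the transposed rows of the
$L \times Q$ matrix $\mathbf{D}$ so that $\mathbf{D}^T = [\mathbf{d}_1 \cdots \mathbf{d}_L]$; and the $\mathbf{s}_i \in \mathbb{C}^Q$, $i \in [1:R]$,
are defined through $\mathbf{s} = [\mathbf{s}_1^T \cdots \mathbf{s}_R^T]^T$. The calculations
below are somewhat tedious but the idea is simple. Thanks to
Property (A) in Theorem 1 it is possible to find vectors $\mathbf{s}_i \in \mathbb{C}^Q$,
$i \in [1:R]$, such that each column of the matrix $\mathbf{A}$ defined
in (46) has exactly one nonzero element. For this choice of $\mathbf{s}_i \in \mathbb{C}^Q$,
we can then conveniently factorize $|\det\mathbf{J}_2(\mathbf{s})|$ using
the Laplace formula [23, p. 7]; the resulting factors are easily
seen to all be nonzero. We next detail the program just outlined.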
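Take an $i \in [1:R]$ and consider a set $\mathcal{K}_i$ satisfying

$$
\mathcal{K}_i \subseteq
\begin{cases}
[\alpha+1 : L-1], & \text{if } i \in [1:\tilde{R}], \\
[\alpha+1 : L], & \text{if } i \in [\tilde{R}+1:R],
\end{cases}
\tag{47}
$$

with

$$
|\mathcal{K}_i| =
\begin{cases}
Q - 1, & \text{if } RL > RQ + L - 1 \\
(R-1)(L-Q), & \text{if } RL \leq RQ + L - 1.
\end{cases}
\tag{48}
$$

The freedom in the choice of the set $\mathcal{K}_i$ will be used later to ensure
that each column of the matrix $\mathbf{A}$ has exactly one nonzero
element. We shall next show that the vector $\mathbf{s}_i \in \mathbb{C}^Q$ can be
chosen such that the entries of $\mathbf{A}_i$ given by $\mathbf{d}_j^T\mathbf{s}_i$, $j \in \mathcal{K}_i$, equal
zero and the entries $\mathbf{d}_{j'}^T\mathbf{s}_i$, $j' \in \mathcal{K}_i^c$, are nonzero. Since, by (48),
$|\mathcal{K}_i| \leq Q - 1$, Property (A) in Theorem 1 guarantees that the
vectors $\{\mathbf{d}_j\}_{j \in \mathcal{K}_i}$ are linearly independent. Furthermore, the
vectors $\mathbf{d}_{j'}$, $j' \in \mathcal{K}_i^c$, with

$$
\mathcal{K}_i^c \triangleq
\begin{cases}
[\alpha+1 : L-1] \setminus \mathcal{K}_i, & \text{if } i \in [1:\tilde{R}], \\
[\alpha+1 : L] \setminus \mathcal{K}_i, & \text{if } i \in [\tilde{R}+1:R],
\end{cases}
\tag{49}
$$

do not belong to $\mathrm{span}\{\mathbf{d}_j\}_{j \in \mathcal{K}_i}$. Hence, we can find a vector
$\mathbf{s}_i \in \mathbb{C}^Q$ such that

(a) $\mathbf{d}_j^T\mathbf{s}_i = 0$ for all $j \in \mathcal{K}_i$;

(b) $\mathbf{d}_{j'}^T\mathbf{s}_i \neq 0$ for all $j' \in \mathcal{K}_i^c$.

Geometrically, this simply means that $\mathbf{s}_i$ must be chosen such
that it is orthogonal to $\mathrm{span}\{\mathbf{d}_j\}_{j \in \mathcal{K}_i}$ (which is a subspace of
$\mathbb{C}^Q$ of dimension less than or equal to $Q - 1$) and, in addition, is not
orthogonal to any vector in the set $\{\mathbf{d}_{j'}\}_{j' \in \mathcal{K}_i^c}$ (see Fig. 2). Note
that if Property (A) in Theorem 1 were not satisfied, we could
have a vector $\mathbf{d}_{j'}$, $j' \in \mathcal{K}_i^c$, that belongs to $\mathrm{span}\{\mathbf{d}_j\}_{j \in \mathcal{K}_i}$;
in this case there would not exist a vector $\mathbf{s}_i$ that satisfies (a) and
(b) simultaneously.

Fig. 2. Choice of the vector $\mathbf{s}_1$ for $L = 4$, $Q = 3$, $\alpha = 1$, $\mathcal{K}_1 = \{2, 3\}$, $\mathcal{K}_1^c = \{4\}$.

Based on (30), (48), and (49), we can see
that if the vector $\mathbf{s}_i$ is chosen such that conditions (a) and (b)
above are satisfied, the number of nonzero elements, $|\mathcal{K}_i^c|$, in
the matrix $\mathbf{A}_i$ is [see (28)]

$$
|\mathcal{K}_i^c| =
\begin{cases}
L - Q - 1, & \text{if } i \in [1:\tilde{R}], \\
L - Q, & \text{if } i \in [\tilde{R}+1:R].
\end{cases}
$$

Hence, applying the procedure described above to every $i \in [1:R]$
and choosing the corresponding vector $\mathbf{s}_i$ such that (a)
and (b) are satisfied, we obtain a matrix $\mathbf{A}$ [see (46)] with total
number of nonzero elements equal to the number of columns in
$\mathbf{A}$ and given by

$$\sum_{i \in [1:R]}|\mathcal{K}_i^c| = L - \alpha.$$

Now, recall that we have full freedom in our choice of $\mathcal{K}_i$, $i \in [1:R]$,
as long as (47) and (48) are satisfied; this implies that
we have control over the locations of the nonzero elements of $\mathbf{A}$.
Hence, by appropriate choice of the sets $\mathcal{K}_i$, $i \in [1:R]$, we can
ensure that each column of $\mathbf{A}$ contains precisely one nonzero
element.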

Applying the Laplace formula [23, p. 7] iteratively, we then
get

$$|\det(\mathbf{J}_2(\mathbf{s}))| = c\prod_{i \in [1:R]}\left|\det(\mathbf{D}_{\mathcal{K}_i \cup [1:\alpha]})\right|, \tag{50}$$

where $c$ is a positive constant. Finally, since for every $i \in [1:R]$,
$\mathbf{D}_{\mathcal{K}_i \cup [1:\alpha]}$ is a $Q \times Q$ submatrix of $\mathbf{D}$ [see (30) and (48)], it
follows from Property (A) in Theorem 1 that $\mathbf{D}_{\mathcal{K}_i \cup [1:\alpha]}$ has
linearly independent rows and hence $|\det(\mathbf{D}_{\mathcal{K}_i \cup [1:\alpha]})| > 0$, for
all $i \in [1:R]$, which by (50) concludes the proof. ■

The proof of Theorem 1 is now completed as follows. Combining
Lemmas 6 and 8, we conclude that (44) holds. Substituting
(44) into (41) and using (42) and (43), we conclude
that (37) holds. Therefore, by (29), (35), and (36), it follows that
$h(\mathbf{P}\hat{\mathbf{y}}) > -\infty$.
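The selection of $\mathbf{s}_1$ illustrated in Fig. 2 ($L = 4$, $Q = 3$, $\alpha = 1$, $\mathcal{K}_1 = \{2, 3\}$, $\mathcal{K}_1^c = \{4\}$) can be reproduced with a toy matrix. The sketch below rests on several assumptions that are not taken from the paper: the rows of $\mathbf{D}$ are real Vandermonde rows $[1, j, j^2]$, chosen so that any three of them are linearly independent (which is how Property (A) is used in the argument above), and the cross product is used as a convenient way to obtain a vector annihilated by two given rows in dimension three.

```python
def cross(a, b):
    """Cross product of two 3-dimensional real vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    """Bilinear product d^T s (no conjugation)."""
    return sum(p * q for p, q in zip(a, b))


# Hypothetical rows d_1..d_4 of D (1-based keys): Vandermonde rows [1, j, j^2].
d = {j: [1, j, j * j] for j in range(1, 5)}

K1 = [2, 3]       # entries that must vanish, condition (a)
K1_comp = [4]     # entries that must stay nonzero, condition (b)

s1 = cross(d[2], d[3])  # annihilated by d_2 and d_3 by construction

assert all(dot(d[j], s1) == 0 for j in K1)
assert all(dot(d[j], s1) != 0 for j in K1_comp)
print(s1, dot(d[4], s1))
```

Here $\mathbf{d}_4^T\mathbf{s}_1$ equals the determinant of the matrix with rows $\mathbf{d}_4, \mathbf{d}_2, \mathbf{d}_3$, so it is nonzero exactly when these three rows are linearly independent; this is the mechanism by which the linear-independence property enters the proof.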

VII. Conclusions and Future Work 

We characterized the capacity pre-log of a temporally cor- 
related block-fading SIMO channel in the noncoherent setting 
under a mild assumption on the channel covariance matrix. The 
most striking implication of this result is that the pre-log penalty 
in the SISO case due to channel uncertainty can be made to 
vanish in the large block length regime by adding only one 
receive antenna. 

It would be interesting to generalize the results in this paper to
the MIMO case. Preliminary work in this direction was reported
in [24], which establishes a lower bound on the capacity pre-log
of a temporally correlated block-fading MIMO channel.
This lower bound is not accompanied by a matching upper
bound so that the problem of determining the capacity pre-log
in the MIMO case remains open. It is also interesting to
note that [24] avoids the use of Hironaka's theorem through an
alternative proof technique based on properties of subharmonic
functions.

Further interesting open questions include the generalization 
of the results in this paper to the stationary case and the de- 
velopment of coding schemes that achieve the SIMO capacity 
pre-log. 

Appendix A 
Proof of (p~6l> 



^>tCiU[i-.a] is a Q x Q submatrix of D [see (30i and (48 1], it 



The following calculation repeats the steps in 1 19 Thm. 4.2] 
and is provided for the reader's convenience: 

Q 

/(Y c ;x e ) = E^(Y {9} ;x s | Y [1:g _ 1] ) 

q=l 

Q 

= E( / ( y w; y [i :9 -i]' x s)- / ( y m; y [i: 9 -h)) 

9=1 

Q 

- E J ( Y {9}' Y [1: 9 -1]> X S) 
9=1 

Q 

= E ( J ( Y {9}; Y [l:«-l]i H [l:9-l]! X s) 
9=1 

- /(Y M ; H^.!] | Y (1 .,_!], x Q )) 

Q 

- E / ( Y {?} ; Y [l:g-1]> H [1:«-1]' X Q) 
9=1 

(a) Q 

= E 7 ( Y {9}; H [1:9-1]' X 9) 

9=1 

Q 

= E( / ( y m; h [i^-i]I x 9 ) + / ( y m; x 9 )) 

9=1 
(b)



Q 



< Y, J(Y {4} ; %:,_!] I x g )+Q log log(p)+0(l) 

9=1 

Q 

- E 7 ( Y w= x «; H [i:«-i]) + Qiogiog(p) + 

9=1 
Q 

9=1 

Q 

( = ) £7(H {g} ;H [1:? _ 1] )+Qloglog(p) + 0(l) 

9=1 

= Qh(H {1} ) - h(U Q ) + Q log log(p) + 0(1) 

R 

= Q J>(M - J Rlogdet(D Q D^) +Q log log(p) + 0(1) 



(e) 



Qloglog(p) + 0(1) 



where (a) follows because Y{ 9 } is conditionally independent 
ofx[ 1:(? _ 1 ] and of Yn . given x q and H[ 1 . g _ 1 j; (b) follows 
from [19 Th. 4.2]; (c) follows because x q is independent of 



Hh ;j _ij| (d) follows because Y q and x q are conditionally 
independent of H[i : given H g ; and (e) follows because the 
matrix Dq is full-rank and h(hn) = c, i E [1 : i?]. 

Appendix B 
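Proof of Lemma 3

The lemma is based on the change of variables theorem for
integrals, which we restate for the reader's convenience.

Theorem 9. [21, Thm. 7.26], [15, p. 31, Thm. 7.2] Assume that
$\mathbf{g} : \mathcal{U} \subseteq \mathbb{C}^N \to \mathbb{C}^N$ is a continuous vector-valued function that
is one-to-one and differentiable a.e. on $\mathcal{U}$. Let $\mathcal{V} = \mathbf{g}(\mathcal{U})$. Then,

$$\int_{\mathcal{V}} f(\mathbf{v})\,d\mathbf{v} = \int_{\mathcal{U}} f(\mathbf{g}(\mathbf{u}))\,\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2 d\mathbf{u}$$

for every measurable $f : \mathbb{C}^N \to [0, \infty]$.

To prove Lemma 3, we let $f_{\mathbf{v}}(\cdot)$ and $f_{\mathbf{u}}(\cdot)$ denote the PDFs
of the random vectors $\mathbf{v}$ and $\mathbf{u}$, respectively. Then, according to [25,
(7-8)] and [15, p. 31, Thm. 7.2],

$$f_{\mathbf{v}}(\mathbf{g}(\mathbf{u})) = \frac{f_{\mathbf{u}}(\mathbf{u})}{\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2}. \tag{51}$$

Next, let $\mathcal{U}$ and $\mathcal{V}$ denote the support of $f_{\mathbf{u}}(\cdot)$ and $f_{\mathbf{v}}(\cdot)$,
respectively. Then, $\mathcal{V} = \mathbf{g}(\mathcal{U})$ and, on account of Theorem 9, we
have

$$
\begin{aligned}
h(\mathbf{v}) &= -\int_{\mathcal{V}} f_{\mathbf{v}}(\mathbf{v})\log(f_{\mathbf{v}}(\mathbf{v}))\,d\mathbf{v} \\
&= -\int_{\mathcal{U}} f_{\mathbf{v}}(\mathbf{g}(\mathbf{u}))\log(f_{\mathbf{v}}(\mathbf{g}(\mathbf{u})))\,\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2 d\mathbf{u} \\
&\overset{(a)}{=} -\int_{\mathcal{U}} \frac{f_{\mathbf{u}}(\mathbf{u})}{\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2}\,\log\!\left(\frac{f_{\mathbf{u}}(\mathbf{u})}{\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2}\right)\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|^2 d\mathbf{u} \\
&= -\int_{\mathcal{U}} f_{\mathbf{u}}(\mathbf{u})\log(f_{\mathbf{u}}(\mathbf{u}))\,d\mathbf{u} + 2\int_{\mathcal{U}} f_{\mathbf{u}}(\mathbf{u})\log(\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|)\,d\mathbf{u} \\
&= h(\mathbf{u}) + 2\,\mathbb{E}_{\mathbf{u}}\!\left[\log\left|\det(\partial\mathbf{g}/\partial\mathbf{u})\right|\right],
\end{aligned}
$$

where in (a) we used (51). This concludes the proof.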

Appendix C 
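Proof of Lemma 4

We need to show that the function $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ is one-to-one
almost everywhere. It is therefore legitimate to exclude sets of
measure zero from its domain. In particular, we consider the
restriction of the function $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ to the set of pairs $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$
that satisfy

(i) $|x_i| > 0$ for all $i \in \mathcal{J}$;

(ii) $\det\mathbf{J}_2(\mathbf{s}) \neq 0$ with $\mathbf{J}_2(\cdot)$ defined in (38).

Condition (i) excludes those $\mathbf{x}_{\mathcal{J}}$ from the domain of $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ that
have at least one component equal to zero; since the $x_i$, $i \in \mathcal{J}$,
take on values in a continuum, the excluded set has measure zero.
Condition (ii) excludes those $\mathbf{s}$ from the domain of $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ that
have $\det(\mathbf{J}_2(\mathbf{s})) = 0$. Remember that we proved in Section VI-I
(see (44)) that $\mathbb{E}[\log(|\det(\mathbf{J}_2(\mathbf{s}))|)] > -\infty$, which implies
$\det(\mathbf{J}_2(\cdot)) \neq 0$ a.e. Therefore, the set excluded in (ii) must be
a set of measure zero. We conclude that the set of pairs $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$
that violates at least one of the conditions (i) and (ii) is a set of
measure zero.

To show that the resulting restriction of the function $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$
[which, with slight abuse of notation, we still call $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$] is
one-to-one, we take two pairs $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ and $(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$ from the
domain of $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$ and show that if $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = \mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$,
then necessarily $(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = (\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$.

Indeed, assume that both $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ and $(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$ belong to the
domain of $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\cdot)$, i.e., both pairs satisfy conditions (i) and (ii)
above. Suppose that $\mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = \mathbf{g}_{\mathbf{x}_{\mathcal{P}}}(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$, or, equivalently,

$$\mathbf{P}(\mathbf{I}_R \otimes \mathbf{X}\mathbf{D})\,\mathbf{s} = \mathbf{P}(\mathbf{I}_R \otimes \tilde{\mathbf{X}}\mathbf{D})\,\tilde{\mathbf{s}}, \tag{52}$$

where $\mathbf{x} = [\mathbf{x}_{\mathcal{P}}^T\ \mathbf{x}_{\mathcal{J}}^T]^T$, $\mathbf{X} = \mathrm{diag}(\mathbf{x})$, $\tilde{\mathbf{x}} = [\mathbf{x}_{\mathcal{P}}^T\ \tilde{\mathbf{x}}_{\mathcal{J}}^T]^T$, and
$\tilde{\mathbf{X}} = \mathrm{diag}(\tilde{\mathbf{x}})$. We next consider (52) as an equation parametrized by
$(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$ in the variables $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ and show that this equation has
a unique solution. Since $(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = (\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$ (trivially) satisfies
(52), uniqueness then implies that $(\mathbf{s}, \mathbf{x}_{\mathcal{J}}) = (\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$.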

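To prove that (52) has a unique solution, we follow the
approach described in Section III and convert (52) into a linear
system of equations through a change of variables. In particular,
thanks to constraint (i) we can left-multiply both sides of (52)
by $[\mathbf{P}(\mathbf{I}_R \otimes \mathbf{X})\mathbf{P}^T]^{-1}[\mathbf{P}(\mathbf{I}_R \otimes \tilde{\mathbf{X}})\mathbf{P}^T]^{-1}$ to transform (52) into the
equivalent equation

$$\mathbf{P}(\mathbf{I}_R \otimes \tilde{\mathbf{X}}^{-1}\mathbf{D})\,\mathbf{s} = \mathbf{P}(\mathbf{I}_R \otimes \mathbf{X}^{-1}\mathbf{D})\,\tilde{\mathbf{s}}. \tag{53}$$

Next, perform the substitutions $z_i = 1/x_i$, $\tilde{z}_i = 1/\tilde{x}_i$, $i \in [1:L]$,
define $\tilde{\mathbf{z}} = [\tilde{z}_1 \cdots \tilde{z}_L]^T$, and set $\tilde{\mathbf{Z}} = \mathrm{diag}(\tilde{\mathbf{z}})$ so that (53)
can be written as

$$\mathbf{P}(\mathbf{I}_R \otimes \tilde{\mathbf{Z}}\mathbf{D})\,\mathbf{s} = \sum_{i \in [1:L]} z_i\,\mathbf{P}\tilde{\mathbf{a}}_i, \tag{54}$$

where $\tilde{\mathbf{a}}_i = (\mathbf{I}_R \otimes \mathrm{diag}(\mathbf{e}_i)\mathbf{D})\,\tilde{\mathbf{s}}$, $i \in [1:L]$, as defined in (39) with $\mathbf{s}$ replaced by $\tilde{\mathbf{s}}$.
Finally, moving the terms containing the unknowns $z_i$, $i \in \mathcal{J}$,
to the LHS of (54) while keeping the terms containing the fixed
parameters $z_i$, $i \in \mathcal{P}$, on the RHS, we transform (54) into the
equivalent equation

$$\mathbf{P}(\mathbf{I}_R \otimes \tilde{\mathbf{Z}}\mathbf{D})\,\mathbf{s} - \sum_{i \in \mathcal{J}} z_i\,\mathbf{P}\tilde{\mathbf{a}}_i = \sum_{i \in \mathcal{P}} z_i\,\mathbf{P}\tilde{\mathbf{a}}_i. \tag{55}$$

Defining $\mathbf{z}_{\mathcal{J}} = [z_{\alpha+1} \cdots z_L]^T$ and using the expression for $\mathbf{J}(\cdot)$
in (40), we can write (55) as

$$\mathbf{J}(\tilde{\mathbf{s}}, \tilde{\mathbf{z}})\begin{bmatrix}\mathbf{s} \\ -\mathbf{z}_{\mathcal{J}}\end{bmatrix} = \sum_{i \in \mathcal{P}} z_i\,\mathbf{P}\tilde{\mathbf{a}}_i. \tag{56}$$

The solution of (56) is unique if and only if $\det\mathbf{J}(\tilde{\mathbf{s}}, \tilde{\mathbf{z}}) \neq 0$.
We use Lemma 5 to factorize $\det\mathbf{J}(\tilde{\mathbf{s}}, \tilde{\mathbf{z}})$ according to

$$\det(\mathbf{J}(\tilde{\mathbf{s}}, \tilde{\mathbf{z}})) = \det(\mathbf{J}_1(\tilde{\mathbf{z}}))\,\det(\mathbf{J}_2(\tilde{\mathbf{s}}))\,\det(\mathbf{J}_3(\tilde{\mathbf{z}}_{\mathcal{J}})). \tag{57}$$

The first and the third term on the RHS of (57) can be written
as follows:

$$\det(\mathbf{J}_1(\tilde{\mathbf{z}})) = \prod_{j=1}^{L-1}\tilde{z}_j^{R}\ \tilde{z}_L^{(R-\tilde{R})} = \prod_{j=1}^{L-1}\tilde{x}_j^{-R}\ \tilde{x}_L^{-(R-\tilde{R})}$$

$$\det(\mathbf{J}_3(\tilde{\mathbf{z}}_{\mathcal{J}})) = \prod_{j \in \mathcal{J}}\frac{1}{\tilde{z}_j} = \prod_{j \in \mathcal{J}}\tilde{x}_j,$$

where $\tilde{R}$ denotes the number of blocks $r$ for which the index $rL$ is excluded from $\mathcal{I}$ in Section VI-F. Both terms are nonzero due to constraint (i) stated at the beginning
of this Appendix; $\det(\mathbf{J}_2(\tilde{\mathbf{s}})) \neq 0$ due to constraint (ii). Hence
$\det\mathbf{J}(\tilde{\mathbf{s}}, \tilde{\mathbf{z}}) \neq 0$ and the solution of (56) in the variables $(\mathbf{s}, \mathbf{z}_{\mathcal{J}})$ is
unique. Therefore, the solution of (52) [parametrized by $(\tilde{\mathbf{s}}, \tilde{\mathbf{x}}_{\mathcal{J}})$]
in the variables $(\mathbf{s}, \mathbf{x}_{\mathcal{J}})$ is unique. This completes the proof.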

We conclude this section by closing an issue that was left
open in the back-of-the-envelope calculation in Section III.
Specifically, we will show that the matrix $\mathbf{B}$ in (9) is full-rank.
For $L = 3$ and $R = Q = 2$, the matrix $\mathbf{B}$ in (9) is related to $\mathbf{J}(\cdot)$
in (40) according to $\mathbf{B} = (\mathbf{I}_2 \otimes \mathbf{X})\,\mathbf{J}(\mathbf{s}, \mathbf{z})$ with $\mathbf{z} = [z_1 \cdots z_L]^T$.
Hence $\det(\mathbf{B}) = \det(\mathbf{I}_2 \otimes \mathbf{X})\,\det(\mathbf{J}(\mathbf{s}, \mathbf{z}))$. Since we assumed
in Section III that $|x_i| > 0$, $i \in [1:L]$, we have $\det(\mathbf{I}_2 \otimes \mathbf{X}) \neq 0$.
Together with $\det(\mathbf{J}(\mathbf{s}, \mathbf{z})) \neq 0$, a.e., as shown above, we can
conclude that, indeed, $\det(\mathbf{B}) \neq 0$, a.e., as claimed in Section III.



Appendix D 
Proof of Lemma 6

Instead of working with

$$I \triangleq \int_{\mathbb{C}^N}\exp(-\|\mathbf{u}\|^2)\,\log(|p(\mathbf{u})|)\,d\mathbf{u} \tag{58}$$

it will turn out convenient to consider $|I|$ and to show that $|I| < \infty$,
which trivially implies $I > -\infty$. As already mentioned, the
proof of $|I| < \infty$ is based on Theorem 7. In order to be able
to apply Theorem 7, we will need to transform the integration
domain in (58) into a compact set in $\mathbb{R}^{2N}$, transform the complex-valued
polynomial $p(\cdot)$ into a real-valued function, and get rid of
the term $\exp(-\|\mathbf{u}\|^2)$. All this will be accomplished as follows.
First, we bound $|I|$ by a sum of two integrals over the set $\mathbb{C}^N$;
then, we apply a change of variables to transform these two
integrals into three new integrals. The first two of these three
integrals are over the set $[0, \infty)$, which is still not compact, but
the resulting integrals are simple enough to be bounded directly.
The third integral is over a compact set and can, thus, be bounded
using Theorem 7. We now implement the program just outlined.

Let K denote the degree of the homogeneous polynomial p(-). 
Then, by homogeneity of p(-), 



p(u) =p[ ||u| 



= M\ k p 



and, therefore, 

/= / ex P (-||u|| 2 )log(b(u)|)du 
Jc N 

= K [ ex P (-|!u|| 2 )log(||u||)du 
Jc N 

v * ' 

h 



exp(-||u|| 2 )log(b(u/||u||)|)du. 



We next change variables in I\ and I2 by first transforming the 
domain of integration from C N to M. 2N and then using polar 
coordinates (261 p. 55]. Specifically, we introduce the function 
u : R 2N -> that acts according to 

u(v) = [vi +\v 2 ■■■ V2N-1 + if2Af] T , (59) 



and the function v : 
[0, 2tt] defined through 



x A 



l 2N with A 4 [0,^] 2W - 2 



with 



v(r, t) = rf(t) 



sin(ti) sin(t 2 ) • ■ • sin(t 2 Ar^2) sin(i 2 jv-i) 
sin(ix) sin(t 2 ) ■ • ■ sin(t 2 N-2) cos(i 2 jv-i) 
sin(ti) sin(t 2 ) . . . cos(i 2 Ar_ 2 ) 



(60) 



(61) 



sin(ti) cos(t 2 ) 
cos(ti) 

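It follows from (59)-(61) that

\[
  \|\mathbf{u}(\mathbf{v}(r, \mathbf{t}))\| = \|\mathbf{v}(r, \mathbf{t})\| = r
\]

and therefore

\[
  \frac{\mathbf{u}(\mathbf{v}(r, \mathbf{t}))}{\|\mathbf{u}(\mathbf{v}(r, \mathbf{t}))\|}
  = \frac{\mathbf{u}(r\, \mathbf{f}(\mathbf{t}))}{r}
  = \mathbf{u}(\mathbf{f}(\mathbf{t})).
\]

The determinant of the Jacobian of the function $\mathbf{v}(\cdot)$ is
well-known and is given by [26, p. 55]

\[
  \det \frac{\partial \mathbf{v}}{\partial (r, \mathbf{t})}
  = r^{2N-1} \underbrace{\sin^{2N-2}(t_1) \sin^{2N-3}(t_2) \cdots \sin(t_{2N-2})}_{g(\mathbf{t})}.
\]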
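The two facts just used, namely that the angular map has unit norm and that the Jacobian determinant factors into a power of $r$ times a product of sines, can be sanity-checked numerically. The Python sketch below does so for dimension $2N = 4$ using central finite differences; the helper names, the step size, and the test point are assumptions made for illustration, and the comparison is made in absolute value because the sign of the determinant depends on the ordering of the coordinates.

```python
import math


def f(t):
    # Angular map of (61): len(t) = n - 1 angles, output on the unit sphere in R^n.
    n = len(t) + 1
    comps = [math.prod(math.sin(x) for x in t)]
    for k in range(2, n + 1):
        head = math.prod(math.sin(x) for x in t[: n - k])
        comps.append(head * math.cos(t[n - k]))
    return comps


def v(r, t):
    # Polar-coordinate map of (60).
    return [r * x for x in f(t)]


def g(t):
    # Product of sines appearing in the Jacobian determinant.
    n = len(t) + 1
    return math.prod(math.sin(t[k]) ** (n - 2 - k) for k in range(n - 2))


def det(a):
    # Determinant by Gaussian elimination with partial pivoting.
    a = [row[:] for row in a]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(a[i][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
    return d


def jacobian(r, t, step=1e-6):
    # Central finite-difference Jacobian of v with respect to (r, t).
    x = [r] + list(t)
    n = len(x)
    cols = []
    for k in range(n):
        xp, xm = x[:], x[:]
        xp[k] += step
        xm[k] -= step
        vp, vm = v(xp[0], xp[1:]), v(xm[0], xm[1:])
        cols.append([(a - b) / (2 * step) for a, b in zip(vp, vm)])
    return [[cols[k][i] for k in range(n)] for i in range(n)]


r, t = 1.3, [0.7, 1.1, 2.0]  # arbitrary test point, 2N = 4
unit_norm = math.sqrt(sum(x * x for x in f(t)))
numeric = abs(det(jacobian(r, t)))
closed = r ** len(t) * g(t)
print(unit_norm, numeric, closed)
```

For the chosen point all angles entering $g(\mathbf{t})$ lie in $(0, \pi)$, so $g(\mathbf{t})$ is positive and the finite-difference value can be compared with $r^{2N-1} g(\mathbf{t})$ directly.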



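Changing variables in $I_1$ and $I_2$ according to
$\mathbf{u} \to \mathbf{v} \to (r, \mathbf{t})$, we obtain

\[
\begin{aligned}
  I_1 &= K \int_{0}^{\infty}\!\!\int_{\Delta} \exp(-r^2) \log(r)\, r^{2N-1} g(\mathbf{t})\, d\mathbf{t}\, dr \\
  I_2 &= \int_{0}^{\infty}\!\!\int_{\Delta} \exp(-r^2)
         \log(|p(\mathbf{u}(\mathbf{f}(\mathbf{t})))|)\, r^{2N-1} g(\mathbf{t})\, d\mathbf{t}\, dr.
\end{aligned}
\]

By the triangle inequality we have

\[
  |I| \le |I_1| + |I_2|.
\]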

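Using $|g(\mathbf{t})| \le 1$, we get

\[
  |I_1| \le K\, 2\pi^{2N-1} \int_{0}^{\infty} \exp(-r^2) |\log(r)|\, r^{2N-1}\, dr < \infty
\]

\[
\begin{aligned}
  |I_2| &\le \int_{0}^{\infty} \exp(-r^2)\, r^{2N-1}\, dr \times
            \int_{\Delta} |\log(|p(\mathbf{u}(\mathbf{f}(\mathbf{t})))|)|\, d\mathbf{t} \\
        &\le c \int_{\Delta} |\log(|p(\mathbf{u}(\mathbf{f}(\mathbf{t})))|^2)|\, d\mathbf{t}.
\end{aligned}
\tag{62}
\]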
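The two radial integrals appearing in these bounds can be evaluated numerically as a sanity check. The Python sketch below uses a plain midpoint rule on a truncated interval; the truncation point, the number of nodes, and the function names are assumptions chosen for illustration. For the integral without the logarithm, the substitution $s = r^2$ gives the closed form $\Gamma(N)/2$, which the sketch compares against.

```python
import math


def midpoint(func, a, b, n=200_000):
    # Composite midpoint rule on [a, b]; never evaluates func at the endpoints.
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def radial_weight(N, R=12.0):
    # Approximates the integral of exp(-r^2) r^(2N-1) over [0, inf), truncated at R.
    return midpoint(lambda r: math.exp(-r * r) * r ** (2 * N - 1), 0.0, R)


def radial_log(N, R=12.0):
    # Approximates the integral of exp(-r^2) |log r| r^(2N-1) over [0, inf), truncated at R.
    return midpoint(
        lambda r: math.exp(-r * r) * abs(math.log(r)) * r ** (2 * N - 1), 0.0, R
    )


N = 2
weight = radial_weight(N)
closed_form = math.gamma(N) / 2  # from the substitution s = r^2
log_integral = radial_log(N)
print(weight, closed_form, log_integral)
```

Since $|\log(x^2)| = 2|\log(x)|$ for $x > 0$, absorbing the factor $1/2$ into the constant means that, under this reading, $c = \Gamma(N)/4$ is one admissible choice in (62).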

We hereby disposed of the integrals over unbounded domains and are left
only with an integral over the compact set $\Delta$. Note also that by
absorbing a factor $1/2$ into $c$ we introduced a square in (62), which will
turn out useful later. In order to prove that $|I| < \infty$ it now remains
to show that

\[
  \int_{\Delta} |\log(|p(\mathbf{u}(\mathbf{f}(\mathbf{t})))|^2)|\, d\mathbf{t} < \infty.
  \tag{63}
\]



Note that $|p(\mathbf{u}(\mathbf{f}(\cdot)))|^2 : \Delta \to \mathbb{R}_+$ is a
real analytic function by [14, Prop. 2.2.2], because it is a composition of
the polynomial $|p(\mathbf{u}(\cdot))|^2 : \mathbb{R}^{2N} \to \mathbb{R}_+$ and
the function $\mathbf{f}(\cdot) : \Delta \to \mathbb{R}^{2N}$ that has real
analytic components (trigonometric functions are real analytic on
$\mathbb{R}$). Furthermore, by assumption, $p(\cdot) \not\equiv 0$ and hence
$|p(\mathbf{u}(\mathbf{f}(\cdot)))|^2 \not\equiv 0$. Finally, $\Delta$ is a
compact set. The inequality (63) now follows by application of Theorem 7.
This concludes the proof.

Appendix E 

Proof of Theorem 7 via resolution of singularities

In order to prove Theorem 7, note that
$\int_{\Delta \subset \mathbb{R}^m} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}$
would clearly be finite if the function $f(\cdot)$ were bounded away from
zero on the set $\Delta$. Unfortunately, this is not the case. However,
because $f(\cdot)$ is real analytic and $f(\cdot) \not\equiv 0$, it can take
on the value zero only on a set of measure zero [14, Cor. 1.2.6].
Establishing whether the integral
$\int_{\Delta \subset \mathbb{R}^m} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}$ is
finite hence requires a fine analysis of the behavior of
$|\log(|f(\cdot)|)|$ in the neighborhood of the zero-measure set
$f^{-1}(\{0\})$. This can be accomplished using Hironaka's Theorem on the
Resolution of Singularities, which allows one to write $f(\cdot)$, after a
suitable change of coordinates, as a product of a monomial and a
nonvanishing real analytic function in the neighborhood of each point
$\mathbf{u}$ where $f(\mathbf{u}) = 0$. The integral of the logarithm of this
product can then easily be bounded and shown to be finite. As the tools used
in the following are non-standard, at least in the information theory
literature, we review the main ingredients in some detail. Formally,
Hironaka's Theorem states the following:



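Theorem 10. [11, Theorem 2.3] Let $f(\cdot) \not\equiv 0$ be a real analytic
function [14, Def. 1.1.5] from a neighborhood of the origin $\mathbf{0}$,
denoted $\Omega \subset \mathbb{R}^K$, to $\mathbb{R}$, which satisfies
$f(\mathbf{0}) = 0$. Then, there exists a triple
$(\mathcal{W}, \mathcal{M}, \psi(\cdot))$ such that

(a) $\mathcal{W} \subset \Omega$ is an open set in $\mathbb{R}^K$ with
$\mathbf{0} \in \mathcal{W}$,

(b) $\mathcal{M}$ is a $K$-dimensional real analytic manifold [11, Def. 2.10]
with coordinate charts
$\{\mathcal{M}_p, \varphi_p : \mathcal{C}(\mathbf{0}, \epsilon_p) \to \mathcal{M}_p\}$
for each point $p \in \mathcal{M}$, where $\varphi_p(\cdot)$ is an
isomorphism$^6$ between $\mathcal{C}(\mathbf{0}, \epsilon_p)$ and
$\mathcal{M}_p$ with $\varphi_p(\mathbf{0}) = p$,

(c) $\psi : \mathcal{M} \to \mathcal{W}$ is a real analytic map,

that satisfies the following conditions:

(i) The map $\psi(\cdot)$ is proper, i.e., the inverse image of every
compact set under $\psi(\cdot)$ is compact.

(ii) The map $\psi(\cdot)$ is an isomorphism [11, Def. 2.5] between
$\mathcal{M} \setminus (f \circ \psi)^{-1}(\{0\})$ and
$\mathcal{W} \setminus f^{-1}(\{0\})$.

(iii) For every point $p \in \mathcal{M} \cap ((f \circ \psi)^{-1}(\{0\}))$,
there exist $\mathbf{m}_p, \mathbf{n}_p \in \mathbb{N}_0^K$ and a real
analytic function $g_p(\cdot)$ that is bounded and nonvanishing on
$\mathcal{C}(\mathbf{0}, \epsilon_p)$ such that

\[
  |(f \circ \psi \circ \varphi_p)(\mathbf{v})| = \mathbf{v}^{\mathbf{m}_p},
  \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p)
\]

and the determinant of the Jacobian of the mapping
$(\psi \circ \varphi_p)(\cdot)$ satisfies

\[
  \det\!\left(\frac{\partial (\psi \circ \varphi_p)}{\partial \mathbf{v}}\right)
  = g_p(\mathbf{v})\, \mathbf{v}^{\mathbf{n}_p},
  \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p).
\]

$^6$ Let $\mathcal{U}$ and $\mathcal{V}$ be two real analytic manifolds. A real
analytic map $f : \mathcal{U} \to \mathcal{V}$ is called an isomorphism
between $U \subset \mathcal{U}$ and $V \subset \mathcal{V}$ if it is a
one-to-one and onto map from $U$ to $V$ whose inverse on $V$ is also a real
analytic map.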



Thanks to Theorem 10, in the neighborhood of zero, every real analytic
function that satisfies $f(\cdot) \not\equiv 0$ and $f(\mathbf{0}) = 0$ can be
written as a product of a monomial and a nonvanishing real analytic
function. In order to bound the integral in (45), we will need to represent
$f(\cdot)$ in this form in the neighborhood of every point in the domain of
integration. This representation can be obtained by analyzing two cases
separately. For points $\mathbf{x}$ such that $f(\mathbf{x}) \neq 0$, by
real-analyticity and, hence, continuity, it follows that $f(\cdot)$ is
already nonvanishing in the neighborhood of $\mathbf{x}$ and is hence
trivially representable as a product of a monomial and a nonvanishing real
analytic function. For points $\mathbf{x}$ such that $f(\mathbf{x}) = 0$, the
desired representation can be obtained by appropriately shifting the origin
in Theorem 10. The following straightforward corollary to Theorem 10
conveniently formalizes these statements in a unified fashion.
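To make the phrase "product of a monomial and a nonvanishing real analytic function" concrete, the following Python sketch checks a standard textbook instance. The function $f(x, y) = x^2 + y^2$ and the single blow-up chart $(a, b) \mapsto (a, ab)$ are assumptions chosen purely for illustration and are not the construction used in the proofs here; the chart covers only part of a neighborhood of the origin, and a second chart $(a, b) \mapsto (ab, b)$ would be handled symmetrically.

```python
def f(x, y):
    # Hypothetical example function; it vanishes only at the origin.
    return x * x + y * y


def psi(a, b):
    # One chart of the blow-up of the origin: (a, b) -> (x, y) = (a, a * b).
    return (a, a * b)


def h(a, b):
    # Nonvanishing real analytic factor: f(psi(a, b)) = a^2 * h(a, b).
    return 1.0 + b * b


def jac_det(a, b, step=1e-6):
    # Central finite-difference determinant of the Jacobian of psi at (a, b).
    xa_p, xa_m = psi(a + step, b), psi(a - step, b)
    xb_p, xb_m = psi(a, b + step), psi(a, b - step)
    dx_da = (xa_p[0] - xa_m[0]) / (2 * step)
    dy_da = (xa_p[1] - xa_m[1]) / (2 * step)
    dx_db = (xb_p[0] - xb_m[0]) / (2 * step)
    dy_db = (xb_p[1] - xb_m[1]) / (2 * step)
    return dx_da * dy_db - dx_db * dy_da


a, b = 0.3, -0.7  # arbitrary test point in the chart
composed = f(*psi(a, b))
factored = a ** 2 * h(a, b)
jacobian_value = jac_det(a, b)
print(composed, factored, jacobian_value)
```

In this chart the composition equals the monomial $a^2$ (exponent vector $(2, 0)$) times the nonvanishing factor $1 + b^2$, and the Jacobian determinant equals the monomial $a$ (exponent vector $(1, 0)$) times the constant $1$.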

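Corollary 11. Let $f(\cdot) \not\equiv 0$ be a real analytic function from a
neighborhood of $\mathbf{u} \in \mathbb{R}^K$, denoted
$\Omega \subset \mathbb{R}^K$, to $\mathbb{R}$. Then, there exists a triple
$(\mathcal{W}, \mathcal{M}, \psi(\cdot))$ such that

(a) $\mathcal{W} \subset \Omega$ is an open set in $\mathbb{R}^K$ with
$\mathbf{u} \in \mathcal{W}$,

(b) $\mathcal{M}$ is a $K$-dimensional real analytic manifold [11, Def. 2.10]
with coordinate charts
$\{\mathcal{M}_p, \varphi_p : \mathcal{C}(\mathbf{0}, \epsilon_p) \to \mathcal{M}_p\}$
for each point $p \in \mathcal{M}$, where $\mathcal{M}_p$ is an open set with
$p \in \mathcal{M}_p$ and $\varphi_p(\cdot)$ is an isomorphism between
$\mathcal{C}(\mathbf{0}, \epsilon_p)$ and $\mathcal{M}_p$ with
$\varphi_p(\mathbf{0}) = p$,

(c) $\psi : \mathcal{M} \to \mathcal{W}$ is a real analytic map,

that satisfies the following conditions:

(i) The map $\psi(\cdot)$ is proper, i.e., the inverse image of any compact
set under $\psi(\cdot)$ is compact.

(ii) The map $(\psi \circ \varphi_p)(\cdot)$ is an isomorphism between
$\mathcal{C}(\mathbf{0}, \epsilon_p) \setminus (f \circ \psi \circ \varphi_p)^{-1}(\{0\})$
and $\psi(\mathcal{M}_p) \setminus f^{-1}(\{0\})$.

(iii) For every point $p \in \mathcal{M}$, there exist
$\mathbf{m}_p, \mathbf{n}_p \in \mathbb{N}_0^K$ and real analytic functions
$h_p(\cdot)$ and $g_p(\cdot)$ that are bounded and nonvanishing on
$\mathcal{C}(\mathbf{0}, \epsilon_p)$ such that

\[
  |(f \circ \psi \circ \varphi_p)(\mathbf{v})| = h_p(\mathbf{v})\, \mathbf{v}^{\mathbf{m}_p},
  \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p)
  \tag{64}
\]

and the determinant of the Jacobian of the mapping
$(\psi \circ \varphi_p)(\cdot)$ satisfies

\[
  \det\!\left(\frac{\partial (\psi \circ \varphi_p)}{\partial \mathbf{v}}\right)
  = g_p(\mathbf{v})\, \mathbf{v}^{\mathbf{n}_p},
  \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p).
\]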






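Proof: First consider $\mathbf{u}$ such that $f(\mathbf{u}) \neq 0$. As
already mentioned, in this case the statement of the corollary is a pure
formality since $f(\cdot)$ itself is a nonvanishing real analytic function
in the neighborhood of $\mathbf{u}$. Formally, since $f(\cdot)$ is real
analytic and, hence, continuous, there exists an open cube
$\mathcal{C}(\mathbf{u}, \epsilon)$ on which $f(\cdot)$ is uniformly bounded
and satisfies $f(\mathbf{v}) \neq 0$ for all
$\mathbf{v} \in \mathcal{C}(\mathbf{u}, \epsilon)$. In this case, the
corollary, therefore, follows immediately by choosing
$\mathcal{M} = \mathcal{C}(\mathbf{u}, \epsilon)$,
$\mathcal{W} = \mathcal{C}(\mathbf{u}, \epsilon)$, setting $\psi(\cdot)$ to be
the identity map, defining $\mathcal{M}_p = \mathcal{M}$ for all
$p \in \mathcal{M}$, and setting $\varphi_p(\mathbf{v}) = \mathbf{v} + p$ for
all $\mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon)$.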

Next, consider the more complicated case $f(\mathbf{u}) = 0$. The main idea
is to apply Theorem 10 to the function
$\tilde{f}(\mathbf{t}) = f(\mathbf{t} + \mathbf{u})$,
$\mathbf{t} \in \Omega - \mathbf{u}$. Theorem 10 implies that there exists a
triple $(\tilde{\mathcal{W}}, \tilde{\mathcal{M}}, \tilde{\psi})$ that
satisfies (a)-(c) and (i)-(iii) in Theorem 10 for $\tilde{f}(\cdot)$. Now let

\[
\begin{aligned}
  \mathcal{W} &= \tilde{\mathcal{W}} + \mathbf{u} \\
  \mathcal{M} &= \tilde{\mathcal{M}} \\
  \psi(\cdot) &= \tilde{\psi}(\cdot) + \mathbf{u}.
\end{aligned}
\]

Then (a)-(c) and (i) in the statement of Corollary 11 follow immediately
from (a)-(c) and (i) in Theorem 10.

Condition (ii) in the statement of Corollary 11 follows from (ii) in
Theorem 10 and the fact that $\varphi_p(\cdot)$ is an isomorphism between
$\mathcal{C}(\mathbf{0}, \epsilon_p)$ and $\mathcal{M}_p$.

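To verify (iii) in the statement of Corollary 11, consider the following
two cases separately. First, let $p \in \mathcal{M}$ such that
$(f \circ \psi)(p) = 0$. Then (iii) in the statement of Corollary 11 follows
from (iii) in Theorem 10 and the fact that

\[
\begin{aligned}
  (f \circ \psi \circ \varphi_p)(\mathbf{v})
    &= (\tilde{f} \circ \tilde{\psi} \circ \varphi_p)(\mathbf{v}),
    && \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p) \\
  \det\!\left(\frac{\partial (\psi \circ \varphi_p)}{\partial \mathbf{v}}\right)
    &= \det\!\left(\frac{\partial (\tilde{\psi} \circ \varphi_p)}{\partial \mathbf{v}}\right),
    && \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p).
\end{aligned}
\]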



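Second, let $p \in \mathcal{M}$ with $(f \circ \psi)(p) \neq 0$. As
$(f \circ \psi)(p) = (\tilde{f} \circ \tilde{\psi})(p)$, this implies that
$(\tilde{f} \circ \tilde{\psi})(p) \neq 0$. Since $\tilde{f}(\cdot)$ is a
continuous function (as a translation of $f(\cdot)$ that is real analytic
and hence continuous), there exists an $\epsilon_p > 0$ such that
$\tilde{f}(\cdot)$ is bounded and nonvanishing on the open cube
$\mathcal{C}(\tilde{\psi}(p), \epsilon_p)$. Now (ii) in Theorem 10 implies
that $\tilde{\psi}(\cdot)$ is an isomorphism, i.e.,

\[
  \tilde{\psi} : \tilde{\psi}^{-1}(\mathcal{C}(\tilde{\psi}(p), \epsilon_p))
  \to \mathcal{C}(\tilde{\psi}(p), \epsilon_p).
\]

Define $\varphi_p(\mathbf{v}) = \tilde{\psi}^{-1}(\mathbf{v} + \tilde{\psi}(p))$
for $\mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p)$. Then
$\varphi_p(\mathbf{0}) = p$ and

\[
\begin{aligned}
  (f \circ \psi \circ \varphi_p)(\mathbf{v})
    &= (\tilde{f} \circ \tilde{\psi} \circ \varphi_p)(\mathbf{v}) \\
    &= \tilde{f}(\mathbf{v} + \tilde{\psi}(p)),
    \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p).
\end{aligned}
\]

Therefore, we can simply set
$h_p(\mathbf{v}) = \tilde{f}(\mathbf{v} + \tilde{\psi}(p))$ and the
representation (64) is obtained. Furthermore, since
$\psi(\varphi_p(\mathbf{v})) = \tilde{\psi}(\varphi_p(\mathbf{v})) + \mathbf{u}
= \tilde{\psi}(\tilde{\psi}^{-1}(\mathbf{v} + \tilde{\psi}(p))) + \mathbf{u}
= \mathbf{v} + \tilde{\psi}(p) + \mathbf{u}$, we have

\[
  \det\!\left(\frac{\partial (\psi \circ \varphi_p)}{\partial \mathbf{v}}\right) = 1,
  \quad \text{for all } \mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_p).
\]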



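We now have all the ingredients required to prove Theorem 7.

Proof: For each $\mathbf{u} \in \Delta$, Corollary 11 implies that there
exists a triple $(\mathcal{W}_{\mathbf{u}}, \mathcal{M}_{\mathbf{u}}, \psi_{\mathbf{u}})$
such that $\mathcal{W}_{\mathbf{u}} \subset \Omega$ is an open set containing
$\mathbf{u}$, $\mathcal{M}_{\mathbf{u}}$ is a real analytic manifold, and
$\psi_{\mathbf{u}} : \mathcal{M}_{\mathbf{u}} \to \mathcal{W}_{\mathbf{u}}$ is a
proper map. Furthermore, for each $p \in \mathcal{M}_{\mathbf{u}}$ there
exists a coordinate chart
$\{\mathcal{M}_{\mathbf{u},p}, \varphi_{\mathbf{u},p} : \mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u},p}) \to \mathcal{M}_{\mathbf{u},p}\}$,
where $\mathcal{M}_{\mathbf{u},p}$ is an open set with
$p \in \mathcal{M}_{\mathbf{u},p}$ and $\varphi_{\mathbf{u},p}(\cdot)$ is an
isomorphism between $\mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u},p})$ and
$\mathcal{M}_{\mathbf{u},p}$ with $\varphi_{\mathbf{u},p}(\mathbf{0}) = p$, such
that $(\psi_{\mathbf{u}} \circ \varphi_{\mathbf{u},p})(\cdot)$ is a real
analytic map [11, p. 49] on $\mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u},p})$
and

\[
\begin{aligned}
  |(f \circ \psi_{\mathbf{u}} \circ \varphi_{\mathbf{u},p})(\mathbf{v})|
    &= h_{\mathbf{u},p}(\mathbf{v})\, \mathbf{v}^{\mathbf{m}_{\mathbf{u},p}} \\
  \det\!\left(\frac{\partial (\psi_{\mathbf{u}} \circ \varphi_{\mathbf{u},p})}{\partial \mathbf{v}}\right)
    &= g_{\mathbf{u},p}(\mathbf{v})\, \mathbf{v}^{\mathbf{n}_{\mathbf{u},p}}
\end{aligned}
\]

for all $\mathbf{v} \in \mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u},p})$, where
$g_{\mathbf{u},p}(\cdot)$ and $h_{\mathbf{u},p}(\cdot)$ are real analytic
functions that are nonvanishing on
$\mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u},p})$. Now, for each
$\mathbf{u} \in \Delta$ we choose an open neighborhood of $\mathbf{u}$,
denoted as $\mathcal{W}'_{\mathbf{u}}$, and a compact neighborhood of
$\mathbf{u}$, denoted $\Delta_{\mathbf{u}}$, such that
$\mathbf{u} \in \mathcal{W}'_{\mathbf{u}} \subset \Delta_{\mathbf{u}} \subset \mathcal{W}_{\mathbf{u}}$.
Since $\Delta$ is a compact set [27, 2.31], there exists a finite set of
vectors $\{\mathbf{u}_1, \ldots, \mathbf{u}_N\}$ with
$\mathbf{u}_i \in \Delta$ such that

\[
  \Delta \subset \bigcup_{i \in [1:N]} \mathcal{W}'_i
         \subset \bigcup_{i \in [1:N]} \Delta_i
\]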



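where we set $\mathcal{W}'_i = \mathcal{W}'_{\mathbf{u}_i}$ and
$\Delta_i = \Delta_{\mathbf{u}_i}$ for $i \in [1:N]$. Take an $i \in [1:N]$ and
set $\mathcal{M}_i = \mathcal{M}_{\mathbf{u}_i}$,
$\mathcal{W}_i = \mathcal{W}_{\mathbf{u}_i}$, and
$\psi_i = \psi_{\mathbf{u}_i}$. Since the mapping
$\psi_i : \mathcal{M}_i \to \mathcal{W}_i$ is proper, the set
$\psi_i^{-1}(\Delta_i) \subset \mathcal{M}_i$ is a compact set. Therefore,
there exists a finite number $M_i$ of points
$p_1, \ldots, p_{M_i} \in \mathcal{M}_i$ such that

\[
  \psi_i^{-1}(\Delta_i) \subset \bigcup_{j \in [1:M_i]} \mathcal{M}_{i,j}
  \tag{65}
\]

with $\mathcal{M}_{i,j} = \mathcal{M}_{\mathbf{u}_i, p_j}$. Since (65) holds for
all $i \in [1:N]$, we can upper-bound the integral in (45) as follows: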



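\[
\begin{aligned}
  \int_{\Delta} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}
    &\le \sum_{i \in [1:N]} \int_{\Delta_i} |\log(|f(\mathbf{u})|)|\, d\mathbf{u} \\
    &\le \sum_{i \in [1:N]} \sum_{j \in [1:M_i]}
         \int_{\Delta_i \cap \psi_i(\mathcal{M}_{i,j})} |\log(|f(\mathbf{u})|)|\, d\mathbf{u} \\
    &\le \sum_{i \in [1:N]} \sum_{j \in [1:M_i]}
         \int_{\psi_i(\mathcal{M}_{i,j})} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}.
\end{aligned}
\tag{66}
\]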

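Since $f(\cdot)$ is a real analytic function and, hence, $f^{-1}(\{0\})$ is a
set of measure zero, we have

\[
  \int_{\psi_i(\mathcal{M}_{i,j})} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}
  = \int_{\psi_i(\mathcal{M}_{i,j}) \setminus f^{-1}(\{0\})} |\log(|f(\mathbf{u})|)|\, d\mathbf{u}.
  \tag{67}
\]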



Next, recall that according to (ii) in Corollary 11,
$(\psi_i \circ \varphi_{p_j})(\cdot)$ is an isomorphism between
$\mathcal{C}_{i,j} = \mathcal{C}(\mathbf{0}, \epsilon_{\mathbf{u}_i, p_j}) \setminus (f \circ \psi_i \circ \varphi_{p_j})^{-1}(\{0\})$
and $\psi_i(\mathcal{M}_{i,j}) \setminus f^{-1}(\{0\})$. Therefore, we can apply
the change of variables theorem [Theorem 7.26] to get

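\[
\begin{aligned}
  &\int_{\psi_i(\mathcal{M}_{i,j}) \setminus f^{-1}(\{0\})} |\log(|f(\mathbf{u})|)|\, d\mathbf{u} \\
  &\quad = \int_{\mathcal{C}_{i,j}} \left| g_{i,j}(\mathbf{v})\, \mathbf{v}^{\mathbf{n}_{i,j}}
       \log\!\left(|h_{i,j}(\mathbf{v})\, \mathbf{v}^{\mathbf{m}_{i,j}}|\right) \right| d\mathbf{v} \\
  &\quad \le \sup_{\mathbf{v} \in \mathcal{C}_{i,j}} \left(|g_{i,j}(\mathbf{v})\, \mathbf{v}^{\mathbf{n}_{i,j}}|\right)
       \int_{\mathcal{C}_{i,j}} \left|\log\!\left(|h_{i,j}(\mathbf{v})\, \mathbf{v}^{\mathbf{m}_{i,j}}|\right)\right| d\mathbf{v} \\
  &\quad \overset{(a)}{\le} c_{i,j} \int_{\mathcal{C}_{i,j}} \left|\log\!\left(|\mathbf{v}^{\mathbf{m}_{i,j}}|\right)\right| d\mathbf{v}
       + c'_{i,j} \int_{\mathcal{C}_{i,j}} \left|\log\!\left(|h_{i,j}(\mathbf{v})|\right)\right| d\mathbf{v} \\
  &\quad \overset{(b)}{\le} c_{i,j} \int_{-\epsilon_{i,j}}^{\epsilon_{i,j}} \!\!\cdots\! \int_{-\epsilon_{i,j}}^{\epsilon_{i,j}}
       \left| \sum_{k=1}^{K} [\mathbf{m}_{i,j}]_k \log(|v_k|) \right| dv_1 \cdots dv_K
       + c'_{i,j} \sup_{\mathbf{v} \in \mathcal{C}_{i,j}} \left(\left|\log(|h_{i,j}(\mathbf{v})|)\right|\right) (2\epsilon_{i,j})^{K} \\
  &\quad \overset{(c)}{\le} c_{i,j} \sum_{k=1}^{K} [\mathbf{m}_{i,j}]_k
       \int_{-\epsilon_{i,j}}^{\epsilon_{i,j}} \!\!\cdots\! \int_{-\epsilon_{i,j}}^{\epsilon_{i,j}}
       \left|\log(|v_k|)\right| dv_1 \cdots dv_K + c''_{i,j} \\
  &\quad = c_{i,j} \sum_{k=1}^{K} [\mathbf{m}_{i,j}]_k\, (2\epsilon_{i,j})^{K-1}
       \int_{-\epsilon_{i,j}}^{\epsilon_{i,j}} \left|\log(|v|)\right| dv + c''_{i,j} \\
  &\quad \overset{(d)}{<} \infty.
\end{aligned}
\tag{68}
\]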



Here, $c_{i,j}, c'_{i,j}, c''_{i,j} > 0$, $i \in [1:N]$, $j \in [1:M_i]$, are
finite constants; in (a) we used the fact that $g_{i,j}(\cdot)$ is bounded and
nonvanishing on $\mathcal{C}_{i,j}$; in (b) $[\mathbf{m}_{i,j}]_k$ denotes the
$k$th component of the vector $\mathbf{m}_{i,j}$; in (c) we used the triangle
inequality to bound the first term, the second term is finite because
$h_{i,j}(\cdot)$ is bounded and nonvanishing on $\mathcal{C}_{i,j}$; and in (d)
we used $\int_{-\epsilon_{i,j}}^{\epsilon_{i,j}} |\log(|v|)|\, dv < \infty$.
Combining (66), (67), and (68), we complete the proof. ■
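The last step of the argument rests on the integrability of the logarithmic singularity in one dimension. As a hedged numerical illustration, the Python sketch below compares a midpoint-rule approximation of the integral of $|\log(|v|)|$ over $[-\epsilon, \epsilon]$ with its elementary closed form; the function names, the node count, and the test values of $\epsilon$ are assumptions made for this example only.

```python
import math


def abs_log_integral(eps):
    # Closed form of the integral of |log|v|| over [-eps, eps], eps > 0.
    # For eps <= 1 the integrand equals -log|v|; for eps > 1 the part on
    # [1, eps] (and its mirror image) contributes with the opposite sign.
    if eps <= 1.0:
        return 2.0 * eps * (1.0 - math.log(eps))
    return 2.0 * (2.0 + eps * math.log(eps) - eps)


def midpoint_abs_log(eps, n=200_000):
    # Midpoint rule on (0, eps), doubled by symmetry; the singular point
    # v = 0 is never evaluated because only cell midpoints are used.
    h = eps / n
    half = h * sum(abs(math.log((i + 0.5) * h)) for i in range(n))
    return 2.0 * half


for eps in (0.5, 2.0):
    print(eps, midpoint_abs_log(eps), abs_log_integral(eps))
```

Both columns agree to several digits, which is consistent with the finiteness used in step (d): the singularity of the logarithm at the origin is integrable, so each one-dimensional factor in the bound is finite.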